APR-24-2006 10:39AM FROM-JONES DAY 6507393000 T-631 P. 004 F-809



SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT: Thomas, Winston J. 

Drayna, Dennis T. 
Feder, John N. 
Gnirke, Andreas 
Ruddy, David 
Tsuchihashi, Zenta 
Wolff, Roger K.

(ii) TITLE OF INVENTION: PLASMIDS COMPRISING NUCLEIC ACIDS FROM THE
HEREDITARY HEMOCHROMATOSIS GENE

(iii) NUMBER OF SEQUENCES: 76 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Jones Day 

(B) STREET: 222 East 41st Street

(C) CITY: New York 

(D) STATE: New York 

(E) COUNTRY: USA 

(F) ZIP: 10017

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: Windows 95 

(D) SOFTWARE: FastSEQ for Windows Version 2.0b 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 09/497,957 

(B) FILING DATE: 04-FEB-2000 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/834,497 

(B) FILING DATE: 04-APR-1997 

(C) CLASSIFICATION:



(viii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/652,265 

(B) FILING DATE: 23-MAY-1996

(C) CLASSIFICATION: 

(ix) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/632,673

(B) FILING DATE: 16-APR-1996 

(C) CLASSIFICATION: 

(x) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/630,912

(B) FILING DATE: 04-APR-1996 

(C) CLASSIFICATION: 

(xi) ATTORNEY/AGENT INFORMATION:

(A) NAME: George, Nikolaos C. 

(B) REGISTRATION NUMBER: 39,201 

(C) REFERENCE/DOCKET NUMBER: 8907-087-999

(xii) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 650-739-3939 

(B) TELEFAX: 650-739-3900 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10825 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(361..436, 3762..4025, 4235..4510, 5606..5881,

6040..6153, 7107..7147)
(D) OTHER INFORMATION: /product= "Hereditary Hemochromatosis 

(HH) protein" 

/note= "Normal or wild-type (unaffected) 
Hereditary Hemochromatosis (HH) gene 
allele" 
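
As a reading aid, the CDS location above can be checked arithmetically. The sketch below is an illustration only and is not part of the filed listing. It assumes the location reads join(361..436, 3762..4025, 4235..4510, 5606..5881, 6040..6153, 7107..7147) with 1-based, inclusive coordinates; the names parse_join and CDS_LOCATION are hypothetical.

```python
import re

# Assumed reading of the CDS location (1-based, inclusive coordinates).
CDS_LOCATION = (
    "join(361..436, 3762..4025, 4235..4510, "
    "5606..5881, 6040..6153, 7107..7147)"
)


def parse_join(location):
    """Return the (start, end) pairs found in a join(...) location string."""
    return [(int(a), int(b)) for a, b in re.findall(r"(\d+)\s*\.\.\s*(\d+)", location)]


segments = parse_join(CDS_LOCATION)
# Length of each joined segment, counting both end points.
lengths = [end - start + 1 for start, end in segments]
total_nt = sum(lengths)
codons, remainder = divmod(total_nt, 3)

print("segment lengths:", lengths)
print("total nucleotides:", total_nt)
print("codons:", codons, "remainder:", remainder)
```

Under these assumptions the six segments total 1047 nucleotides, that is 349 complete codons. If the final codon is a stop codon, this agrees with the 348 amino acids stated for SEQ ID NO:2.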

(ix) FEATURE: 

(A) NAME/KEY: -

(B) LOCATION: 140..7319

(D) OTHER INFORMATION: /note= "start and stop positions for

normal or wild-type (unaffected) allele
cDNA (SEQ ID NO:9)"

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 3652.. 3891 

(D) OTHER INFORMATION: /note= "start and stop positions for 

normal or wild-type (unaffected) genomic 
sequence surrounding variant for 24d2(C) 
allele (SEQ ID NO: 41)" 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 5507.. 6023 

(D) OTHER INFORMATION: /note= "start and stop positions for 

normal or wild-type (unaffected) genomic 
sequence surrounding variant for 24d1(G)
allele (SEQ ID NO:20)"

(ix) FEATURE: 

(A) NAME/KEY: allele 

(B) LOCATION: replace(3872, "c")

(D) OTHER INFORMATION: /phenotype= "normal or wild-type

(unaffected)"
/label= 24d2

(ix) FEATURE: 

(A) NAME/KEY: allele

(B) LOCATION: replace(3878, "a")

(D) OTHER INFORMATION: /phenotype= "normal or wild-type

(unaffected)"
/label= 24d7

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(5834, "g")

(D) OTHER INFORMATION: /phenotype= "normal or wild-type

(unaffected)"
/label= 24d1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TCTAAGGTTG AGATAAAATT TTTAAATGTA TGATTGAATT TTGAAAATCA TAAATATTTA 60 

AATATCTAAA GTTCAGATCA GAACATTGCG AAGCTACTTT CCCCAATCAA CAACACCCCT 120 

TCAGGATTTA AAAACCAAGG GGGACACTGG ATCACCTAGT GTTTCACAAG CAGGTACCTT 180

CTGCTGTAGG AGAGAGAGAA CTAAAGTTCT GAAAGACCTG TTGCTTTTCA CCAGGAAGTT 240 

TTACTGGGCA TCTCCTGAGC CTAGGCAATA GCTGTAGGGT GACTTCTGGA GCCATCCCCG 300 

TTTCCCCGCC CCCCAAAAGA AGCGGAGATT TAACGGGGAC GTGCGGCCAG AGCTGGGGAA 360 

ATG GGC CCG CGA GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG 408
Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln
1 5 10 15

ACC GCG GTC CTG CAG GGG CGC TTG CTG C GTGAGTCCGA GGGCTGCGGG 456
Thr Ala Val Leu Gln Gly Arg Leu Leu
20 25








CGAACTAGGG GCGCGGCGGG GGTGGAAAAA TCGAAACTAG CTTTTTCTTT GCGCTTGGGA 516

GTTTGCTAAC TTTGGAGGAC CTGCTCAACC CTATCCGCAA GCCCCTCTCC CTACTTTCTG 576

CGTCCAGACC CCGTGAGGGA GTGCCTACCA CTGAACTGCA GATAGGGGTC CCTCGCCCCA 636

GGACCTGCCC CCTCCCCCGG CTGTCCCGGC TCTGCGGAGT GACTTTTGGA ACCGCCCACT 696

CCCTTCCCCC AACTAGAATG CTTTTAAATA AATCTCGTAG TTCCTCACTT GAGCTGAGCT 756

AAGCCTGGGG CTCCTTGAAC CTGGAACTCG GGTTTATTTC CAATGTCAGC TGTGCAGTTT 816

TTTCCCCAGT CATCTCCAAA CAGGAAGTTC TTCCCTGAGT GCTTGCCGAG AAGGCTGAGC 876

AAACCCACAG CAGGATCCGC ACGGGGTTTC CACCTCAGAA CGAATGCGTT GGGCGGTGGG 936

GGCGCGAAAG AGTGGCGTTG GGGATCTGAA TTCTTCACCA TTCCACCCAC TTTTGGTGAG 996

ACCTGGGGTG GAGGTCTCTA GGGTGGGAGG CTCCTGAGAG AGGCCTACCT CGGGCCTTTC 1056

CCCACTCTTG GCAATTGTTC TTTTGCCTGG AAAATTAAGT ATATGTTAGT TTTGAACGTT 1116

TGAACTGAAC AATTCTCTTT TCGGCTAGGC TTTATTGATT TGCAATGTGC TGTGTAATTA 1176

AGAGGCCTCT CTACAAAGTA CTGATAATGA ACATGTAAGC AATGCACTCA CTTCTAAGTT 1236

ACATTCATAT CTGATCTTAT TTGATTTTCA CTAGGCATAG GGAGGTAGGA GCTAATAATA 1296

CGTTTATTTT ACTAGAAGTT AACTGGAATT CAGATTATAT AACTCTTTTC AGGTTACAAA 1356

GAACATAAAT AATCTGGTTT TCTGATGTTA TTTCAAGTAC TACAGCTGCT TCTAATCTTA 1416

GTTGACAGTG ATTTTGCCCT GTAGTGTAGC ACAGTGTTCT GTGGGTCACA CGCCGGCCTC 1476

AGCACAGCAC TTTGAGTTTT GGTACTACGT GTATCCACAT TTTACACATG ACAAGAATGA 1536

GGCATGGCAC GGCCTGCTTC CTGGCAAATT TATTCAATGG TACACTGGGC TTTGGTGGCA 1596

GAGCTCATGT CTCCACTTCA TAGCTATGAT TCTTAAACAT CACACTGCAT TAGAGGTTGA 1656

ATAATAAAAT TTCATGTTGA GCAGAAATAT TCATTGTTTA CAAGTGTAAA TGAGTCCCAG 1716

CCATGTGTTG CACTGTTCAA GCCCCAAGGG AGAGAGCAGG GAAACAAGTC TTTACCCTTT 1776

GATATTTTGC ATTCTAGTGG GAGAGATGAC AATAAGCAAA TGAGCAGAAA GATATACAAC 1836

ATCAGGAAAT CATGGGTGTT GTGAGAAGCA GAGAAGTCAG GGCAAGTCAC TCTGGGGCTG 1896

ACACTTGAGC AGAGACATGA AGGAAATAAG AATGATATTG ACTGGGAGCA GTATTTCCCA 1956

GGCAAACTGA GTGGGCCTGG CAAGTTGGAT TAAAAAGCGG GTTTTCTCAG CACTACTCAT 2016

GTGTGTGTGT GTGGGGGGGG GGGGCGGCGT GGGGGTGGGA AGGGGGACTA CCATCTGCAT 2076

GTAGGATGTC TAGCAGTATC CTGTCCTCCC TACTCACTAG GTGCTAGGAG CACTCCCCCA 2136

GTCTTGACAA CCAAAAATGT CTCTAAACTT TGCCACATGT CACCTAGTAG ACAAACTCCT 2196

GGTTAAGAAG CTCGGGTTGA AAAAAATAAA CAAGTAGTGC TGGGGAGTAG AGGCCAAGAA 2256

GTAGGTAATG GGCTCAGAAG AGGAGCCACA AACAAGGTTG TGCAGGCGCC TGTAGGCTGT 2316

GGTGTGAATT CTAGCCAAGG AGTAACAGTG ATCTGTCACA GGCTTTTAAA AGATTGCTCT 2376

GGCTGCTATG TGGAAAGCAG AATGAAGGGA GCAACAGTAA AAGCAGGGAG CCCAGCCAGG 2436

AAGCTGTTAC ACAGTCCAGG CAAGAGGTAG TGGAGTGGGC TGGGTGGGAA CAGAAAAGGG 2496

AGTGACAAAC CATTGTCTCC TGAATATATT CTGAAGGAAG TTGCTGAAGG ATTCTATGTT 2556

GTGTGAGAGA AAGAGAAGAA TTGGCTGGGT GTAGTAGCTC ATGCCAAGGA GGAGGCCAAG 2616

GAGAGCAGAT TCCTGAGCTC AGGAGTTCAA GACCAGCCTG GGCAACACAG CAAAACCCCT 2676

TCTCTACAAA AAATACAAAA ATTAGCTGGG TGTGGTGGCA TGCACCTGTG ATCCTAGCTA 2736

CTCGGGAGGC TGAGGTGGAG GGTATTGCTT GAGCCCAGGA AGTTGAGGCT GCAGTGAGCC 2796

ATGACTGTGC CACTGTACTT CAGCCTAGGT GACAGAGCAA GACCCTGTCT CCCCTGACCC 2856

CCTGAAAAAG AGAAGAGTTA AAGTTGACTT TGTTCTTTAT TTTAATTTTA TTGGCCTGAG 2916

CAGTGGGGTA ATTGGCAATG CCATTTCTGA GATGGTGAAG GCAGAGGAAA GAGCAGTTTG 2976

GGGTAAATCA AGGATCTGCA TTTGGGACAT GTTAAGTTTG AGATTCCAGT CAGGCTTCCA 3036

AGTGGTGAGG CCACATAGGC AGTTCAGTGT AAGAATTCAG GACCAAGGCT GGGCACGGTG 3096

GCTCACTTCT GTAATCCCAG CACTTTGGTG GCTGAGGCAG GTAGATCATT TGAGGTCAGG 3156

AGTTTGAGAC AAGCTTGGCC AACATGGTGA AACCCCATGT CTACTAAAAA TACAAAAATT 3216

AGCCTGGTGT GGTGGCGCAC GCCTATAGTC CCAGGTTTTC AGGAGGCTTA GGTAGGAGAA 3276

TCCCTTGAAC CCAGGAGGTG CAGGTTGCAG TGAGCTGAGA TTGTGCCACT GCACTCCAGC 3336

CTGGGTGATA GAGTGAGACT CTGTCTCAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTGA 3396

AGGAATTATT CCTCAGGATT TGGGTCTAAT TTGCCCTGAG CACCAACTCC TGAGTTCAAC 3456

TACCATGGCT AGACACACCT TAACATTTTC TAGAATCCAC CAGCTTTAGT GGAGTCTGTC 3516

TAATCATGAG TATTGGAATA GGATCTGGGG GCAGTGAGGG GGTGGCAGCC ACGTGTGGCA 3576

GAGAAAAGCA CACAAGGAAA GAGCACCCAG GACTGTCATA TGGAAGAAAG ACAGGACTGC 3636

AACTCACCCT TCACAAAATG AGGACCAGAC ACAGCTGATG GTATGAGTTG ATGCAGGTGT 3696

GTGGAGCCTC AACATCCTGC TCCCCTCCTA CTACACATGG TTAAGGCCTG TTGCTCTGTC 3756


TCCAG GT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT GCC TCA GAG 3802
Arg Ser His Ser Leu His Tyr Leu Phe Met Gly Ala Ser Glu
30 35

CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC GTG GAT GAC 3850
Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr Val Asp Asp
40 45 50 55

CAG CTG TTC GTG TTC TAT GAT CAT GAG AGT CGC CGT GTG GAG CCC CGA 3898
Gln Leu Phe Val Phe Tyr Asp His Glu Ser Arg Arg Val Glu Pro Arg
60 65 70

ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG CTG CAG CTG 3946
Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp Leu Gln Leu
75 80 85

AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT GAC TTC TGG 3994
Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val Asp Phe Trp
90 95 100

ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG G GTATGTGGAG AGGGGGCCTC 4045
Thr Ile Met Glu Asn His Asn His Ser Lys
105 110

ACCTTCCTGA GGTTGTCAGA GCTTTTCATC TTTTCATGCA TCTTGAAGGA AACAGCTGGA 4105 

AGTCTGAGGT CTTGTGGGAG CAGGGAAGAG GGAAGGAATT TGCTTCCTGA GATCATTTGG 4165 

TCCTTGGGGA TGGTGGAAAT AGGGACCTAT TCCTTTGGTT GCAGTTAACA AGGCTGGGGA 4225 

TTTTTCCAG AG TCC CAC ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG 4272
Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met
115 120 125

CAA GAA GAC AAC AGT ACC GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG 4320
Gln Glu Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly
130 135 140

CAG GAC CAC CTT GAA TTC TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA 4368
Gln Asp His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala
145 150 155

GAA CCC AGG GCC TGG CCC ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT 4416
Glu Pro Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile
160 165 170

CGG GCC AGG CAG AAC AGG GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG 4464
Arg Ala Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln
175 180 185 190

CTG CAG CAG TTG CTG GAG CTG GGG AGA GGT GTT TTG GAC CAA CAA G 4510
Leu Gln Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln
195 200 205




GTATGGTGGA AACACACTTC TGCCCCTATA CTCTAGTGGC AGAGTGGAGG AGGTTGCAGG 4570

GCACGGAATC CCTGGTTGGA GTTTCAGAGG TGGCTGAGGC TGTGTGCCTC TCCAAATTCT 4630

GGGAAGGGAC TTTCTCAATC CTAGAGTCTC TACCTTATAA TTGAGATGTA TGAGACAGCC 4690

ACAAGTCATG GGTTTAATTT CTTTTCTCCA TGCATATGGC TCAAAGGGAA GTGTCTATGG 4750

CCCTTGCTTT TTATTTAACC AATAATCTTT TGTATATTTA TACCTGTTAA AAATTCAGAA 4810

ATGTCAAGGC CGGGCACGGT GGCTCACCCC TGTAATCCCA GCACTTTGGG AGGCCGAGGC 4870

GGGTGGTCAC AAGGTCAGGA GTTTGAGACC AGCCTGACCA ACATGGTGAA ACCCGTCTCT 4930

AAAAAAATAC AAAAATTAGC TGGTCACAGT CATGCGCACC TGTAGTCCCA GCTAATTGGA 4990

AGGCTGAGGC AGGAGCATCG CTTGAACCTG GGAAGCGGAA GTTGCACTGA GCCAAGATCG 5050


CGCCACTGCA CTCCAGCCTA GGCAGCAGAG TGAGACTCCA TCTTAAAAAA AAAAAAAAAA 5110

AAAAAAAGAG AATTCAGAGA TCTCAGCTAT CATATGAATA CCAGGACAAA ATATCAAGTG 5170

AGGCCACTTA TCAGAGTAGA AGAATCCTTT AGGTTAAAAG TTTCTTTCAT AGAACATAGC 5230

AATAATCACT GAAGCTACCT ATCTTACAAG TCCGCTTCTT ATAACAATGC CTCCTAGGTT 5290

GACCCAGGTG AAACTGACCA TCTGTATTCA ATCATTTTCA ATGCACATAA AGGGCAATTT 5350

TATCTATCAG AACAAAGAAC ATGGGTAACA GATATGTATA TTTACATGTG AGGAGAACAA 5410

GCTGATCTGA CTGCTCTCCA AGTGACACTG TGTTAGAGTC CAATCTTAGG ACACAAAATG 5470

GTGTCTCTCC TGTAGCTTGT TTTTTTCTGA AAAGGGTATT TCCTTCCTCC AACCTATAGA 5530

AGGAAGTGAA AGTTCCAGTC TTCCTGGCAA GGGTAAACAG ATCCCCTCTC CTCATCCTTC 5590

CTCTTTCCTG TCAAG TG CCT CCT TTG GTG AAG GTG ACA CAT CAT GTG ACC 5640
Val Pro Pro Leu Val Lys Val Thr His His Val Thr
210 215

TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG AAC TAC TAC CCC CAG 5688
Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu Asn Tyr Tyr Pro Gln
220 225 230

AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG CCA ATG GAT GCC AAG 5736
Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln Pro Met Asp Ala Lys
235 240 245

GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG GAT GGG ACC TAC CAG 5784
Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly Asp Gly Thr Tyr Gln
250 255 260 265

GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA GAG CAG AGA TAT ACG 5832
Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu Glu Gln Arg Tyr Thr
270 275 280

TGC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC CTC ATT GTG ATC TGG G 5881
Cys Gln Val Glu His Pro Gly Leu Asp Gln Pro Leu Ile Val Ile Trp
285 290 295

GTATGTGACT GATGAGAGCC AGGAGCTGAG AAAATCTATT GGGGGTTGAG AGGAGTGCCT 5941

GAGGAGGTAA TTATGGCAGT GAGATGAGGA TCTGCTCTTT GTTAGGGGGT GGGCTGAGGG 6001

TGGCAATCAA AGGCTTTAAC TTGCTTTTTC TGTTTTAG AG CCC TCA CCG TCT 6053
Glu Pro Ser Pro Ser
300

GGC ACC CTA GTC ATT GGA GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC 6101
Gly Thr Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val
305 310 315

ATC TTG TTC ATT GGA ATT TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT 6149
Ile Leu Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly
320 325 330

TCA A GTGAGTAGGA ACAAGGGGGA AGTCTCTTAG TACCTCTGCC CCAGGGCACA 6203
Ser
335














GTGGGAAGAG GGGCAGAGGG GATCTGGCAT CCATGGGAAG CATTTTTCTC ATTTATATTC 6263

TTTGGGGACA CCAGCAGCTC CCTGGGAGAC AGAAAATAAT GGTTCTCCCC AGAATGAAAG 6323

TCTCTAATTC AACAAACATC TTCAGAGCAC CTACTATTTT GCAAGAGCTG TTTAAGGTAG 6383

TACAGGGGCT TTGAGGTTGA GAAGTCACTG TGGCTATTCT CAGAACCCAA ATCTGGTAGG 6443

GAATGAAATT GATAGCAAGT AAATGTAGTT AAAGAAGACC CCATGAGGTC CTAAAGCAGG 6503

CAGGAAGCAA ATGCTTAGGG TGTCAAAGGA AAGAATGATC ACATTCAGCT GGGGATCAAG 6563

ATAGCCTTCT GGATCTTGAA GGAGAAGCTG GATTCCATTA GGTGAGGTTG AAGATGATGG 6623

GAGGTCTACA CAGACGGAGC AACCATGCCA AGTAGGAGAG TATAAGGCAT ACTGGGAGAT 6683

TAGAAATAAT TACTGTACCT TAACCCTGAG TTTGCGTAGC TATCACTCAC CAATTATGCA 6743

TTTCTACCCC CTGAACATCT GTGGTGTAGG GAAAAGAGAA TCAGAAAGAA GCCAGCTCAT 6803

ACAGAGTCCA AGGGTCTTTT GGGATATTGG GTTATGATCA CTGGGGTGTC ATTGAAGGAT 6863

CCTAAGAAAG GAGGACCACG ATCTCCCTTA TATGGTGAAT GTGTTGTTAA GAAGTTAGAT 6923

GAGAGGTGAG GAGACCAGTT AGAAAGCCAA TAAGCATTTC CAGATGAGAG ATAATGGTTC 6983

TTGAAATCCA ATAGTGCCCA GGTCTAAATT GAGATGGGTG AATGAGGAAA ATAAGGAAGA 7043

GAGAAGAGGC AAGATGGTGC CTAGGTTTGT GATGCCTCTT TCCTGGGTCT CTTGTCTCCA 7103

CAG GA GGA GCC ATG GGG GAC TAC GTC TTA GCT GAA CGT GAG 7144
Arg Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu 
340 345 



1 GACACGGAG 




r A r*T G T GGG A 


AG GAG AC AAA 


ACTAGAGACT 


CAAAGAGGGA 


7204 


GTGCATTTAT 


bnb^ 1 C X 1 Wl 




a f27x RTTf3 a a 


tt x /liUi^n x 


GAAAT T GCCT 


7264 


GACGAACTCC 


T T GAT T TT AG 


CCX lCXCiWX 


X LA XXX CC X C 


AAAAAWi-H. X X X 




7324 


GTTTCTGAGT 


TCCTGCA 1 GC 


Lbu X ua 1 CC C 


X Ad?C 1 \7 X wiL 




GAArTGTCTC 


7384 


TCATGAACCT 


CAAGC 1 GCAX 


C InunbuL X X 


CC 1 1 CAX X It 


L X LUU X Cnkr^r 


TPAGaGAf*AT 


744 4 


AC ACCT A 1 G i 


1^/4.1 X X \*£\ 1 X 1 


LL lnl J, i A ID 


vjitAvanu ual x 


^fTTAAATTT 


GGGGGACTTA 


7504 


CATGArrCftT 


ntrnrp 74 74 ^ 71 OP ^fTl 

111 nn\*A 1 C 1 






X UV3U x 


AGTCATAACC 


7564 


T TAC CAG ATT 


TTTACACA 1 G 


t a T n T A T G f" A 




vwu x x win'w- x 


XXX v^<»— » x j. ± \sr\ 


7624 


ATCCTCTCTC 


1 GTGT TAC CC 


nbi AAv <L CAX 


C X Lj 1 LAuLnn 


Ukolw X 1 


TTrTTfTATC 
iXvii v*v»ri x 


7 684 


TG AT T GT GAT 


0 1 ijAva i 1 uLii 


LAov X A X uAA 


r^fsr' t/it a p a n 


X UVw^iU x \3 


GAAC5AGGCAC 


7744 


CTgTCCCAGA 


71 T» a n f** 7a In #*■ a 
AAAAGC AT CA 


1 ^l^JtAlt lu 


X ijuulnblnl 


rtaTl^^fTTf^TT 
onlwu^l, x 


X X X AUVrfXVVSU X 


7804 


AGGAGGGAAA 


X AX C 1 1 GAAA 


crr;cn t t tz t n 




X X ll«* X-frLfiX. 1 VJ 


Gf ATGAAGGT 


7864 


GTCATACAGA 


TTTGCAAAG 1 


X lau 1 oL 


C X X CAX 1 1 oG 


IjAX tfl*- X Ak* X O 




7 924 


GACCTGAAGA 


AT CAC aATAA 


TTTTCTACCT 


GGTcTCTcCT 


fP^iTi r n/^ r r , r^TV'Ti7v 

TGTTCTGATA 


74fP/ n, 74747474 r T fr T , Il 

A! GAAAAX X A 


7QO/ 


TGATAAGGAT 


GATAAAAGCA 


CTTAC1TCGT 


GTCCGACTCT 


pn /^rp /»" TV TV 

T CT GAGC AC C 


lAC-1 1ACAXG 


A A A A 


CATT AC T GGA 


TGCACXTCTT 


TV TV TATS 

AC AAT AAT T C 


TAT GAGA1 AG 


G1ACXA1 XA1 




ft! 


TTTTTAAATG 


ti » /*»tv t\ tv t* 

AAGAAAGTGA 


AGT AG CjC C G G 


^ 7i ^* t» ^ 
GCACGG 1 GGC 


TCACGCC 1 GX 


AA ILL LauLA 




CTTTGGGAGG 


CC AAAGC la G G 




GG 1 CAGGAGA 


1 CGAGAd^AX 


LL X ovjL 1 AnL 


8224 


TV ITV/"* /^rn /"^ TV TV TV /"^ 

ATG G T GAAAC 


CCCAlCICi A 




aaaaaaTTar* 

AAAAAAX 1 AG 


X cijuLu X VJ\J 


TGGf^AGACGC 


8284 


CTGT AGT CCC 


AGCTACTCGG 


AAbbu 1 GAGG 


CAGG AGAA X G 






8344 


GCTTGCAGTG AGCCGAGTTT GCGCCACTGC ACTCCAGCCT AGGTGACAGA GTGAGACTCC 8404

ATCTCAAAAA AATAAAAATA AAAATAAAAA AATGAAAAAA AAAAGAAAGT GAAGTATAGA 8464

GTATCTCATA GTTTGTCAGT GATAGAAACA GGTTTCAAAC TCAGTCAATC TGACCGTTTG 8524

ATACATCTCA GACACCACTA CATTCAGTAG TTTAGATGCC TAGAATAAAT AGAGAAGGAA 8584

GGAGATGGCT CTTCTCTTGT CTCATTGTGT TTCTTCTGAG TGAGCTl^GAA TCACATGAAG 8644

GGGAACAGCA GAAAACAACC AACTGATCCT CAGCTGTCAT GTTTCCTTTA AAAGTCCCTG 8704

AAGGAAGGTC CTGGAATGTG ACTCCCTTGC TCCTCTGTTG CTCTCTTTGG CATTCATTTC 8764

TTTGGACCCT ACGCAAGGAC TGTAATTGGT GGGGACAGCT AGTGGCCCTG CTGGGCTTCA 8824

CACACGGTGT CCTCCCTAGG CCAGTGCCTC TGGAGTCAGA ACTCTGGTGG TATTTCCCTC 8884

AATGAAGTGG AGTAAGCTCT CTCATTTTGA GATGGTATAA TGGAAGCCAC CAAGTGGCTT 8944

AGAGGATGCC CAGGTCCTTC CATGGAGCCA CTGGGGTTCC GGTGCACATT AAAAAAAAAA 9004

TCTAACCAGG ACATTCAGGA ATTGCTAGAT TCTGGGAAAT CAGTTCACCA TGTTCAAAAG 9064

AGTCTTTTTT TTTTTTTTGA GACTCTATTG CCCAGGCTGG AGTGCAATGG CATGATCTCG 9124

GCTCACTGTA ACCTCTGCCT CCCAGGTTCA AGCGATTCTC CTGTCTCAGC CTCCCAAGTA 9184

GCTGGGATTA CAGGCGTGCA CCACCATGCC CGGCTAATTT TTGTATTTTT AGTAGAGACA 9244

GGGTTTCACC ATGTTGGCCA GGCTGGTCTC GAACTCTCCT GACCTCGTGA TCCGCCTGCC 9304

TCGGCCTCCC AAAGTGCTGA GATTACAGGT GTGAGCCACC CTGCCCAGCC GTCAAAAGAG 9364

TCTTAETATA TATATCCAGA TGGCATGTGT TTACTTTATG TTACTACATG CACTTGGCTG 9424

CATAAATGTG GTACAAGCAT TCTGTCTTGA AGGGCAGGTG CTTCAGGATA CCATATACAG 9484

CTCAGAAGTT TCTTCTTTAG GCATTAAATT TTAGCAAAGA TATCTCATCT CTTCTTTTAA 9544

ACCATTTTCT TTTTTTGTCG TTAGAAAAGT TATGTAGAAA AAAGTAAATG TGATTTACGC 9604

TCATTGTAGA AAAGCTATAA AATGAATACA ATTAAAGCTG TTATTTAATT AGCCAGTGAA 9664

AAACTATTAA CAACTTGTCT ATTACCTGTT AGTATTATTG TTGCATTAAA AATGCATATA 9724

CTTTAATAAA TGTATATTGT ATTGTATACT GCATGATTTT ATTGAAGTTC TTGTTCATCT 9784

TGTGTATATA CTTAATCGCT TTGTCATTTT GGAGACATTT ATTTTGCTTC TAATTTCTTT 9844

ACATTTTGTC TTACGGAATA TTTTCATTCA ACTGTGGTAG CCGAATTAAT CGTGTTTCTT 9904

CACTCTAGGG ACATTGTCGT CTAAGTTGTA AGACATTGGT TATTTTACCA GCAAACCATT 9964

CTGAAAGCAT ATGACAAATT ATTTCTCTCT TAATATCTTA CTATACTGAA AGCAGACTGC 10024

TATAAGGCTT CACTTACTCT TCTACCTCAT AAGGAATATG TTACAATTAA TTTATTAGGT 10084

AAGCATTTGT TTTATATTGG TTTTATTTCA CCTGGGCTGA GATTTCAAGA AACACCCCAG 10144

TCTTCACAGT AACACATTTC ACTAACACAT TTACTAAACA TCAGCAACTG TGGCCTGTTA 10204

ATTTTTTTAA TAGAAATTTT AAGTCCTCAT TTTCTTTGGG TGTTTTTTAA GCTTAATTTT 10264

TCTGGCTTTA TTCATAAATT CTTAAGGTCA ACTACATTTG AAAAATCAAA GACCTGCATT 10324

TTAAATTCTT ATTCACCTCT GGCAAAACCA TTCACAAACC ATGGTAGTAA AGAGAAGGGT 10384

GACACCTGGT GGCCATAGGT AAATGTACCA CGGTGGTCCG GTGACCAGAG ATGCAGCGCT 10444

GAGGGTTTTC CTGAAGGTAA AGGAATAAAG AATGGGTGGA GGGGCGTGCA CTGGAAATCA 10504

CTTGTAGAGA AAAGCCCCTG AAAATTTGAG AAAACAAACA AGAAACTACT TACCAGCTAT 10564

TTGAATTGCT GGAATCACAG GCCATTGCTG AGCTGCCTGA ACTGGGAACA CAACAGAAGG 10624

AAAACAAACC ACTCTGATAA TCATTGAGTC AAGTACAGCA GGTGATTGAG GACTGCTGAG 10684

AGGTACAGGC CAAAATTCTT ATGTTGTATT ATAATAATGT CATCTTATAA TACTGTCAGT 10744

ATTTTATAAA ACATTCTTCA CAAACTCACA CACATTTAAA AACAAAACAC TGTCTCTAAA 10804

ATCCCCAAAT TTTTCATAAA C 10825 
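
Because each full sequence row carries six ten-base blocks followed by a running position counter, the counters give a simple integrity check for a transcribed listing. The following is a minimal sketch under that assumption, not a procedure taken from the listing itself; the function name check_rows is hypothetical, and rows are assumed to be strings whose last whitespace-separated token is the counter.

```python
def check_rows(rows):
    """Compare each row's printed counter with the running base count.

    Returns a list of (row_index, counted, printed) for every mismatch.
    After a mismatch the count continues from the printed counter
    (an assumption; a wrong counter would also flag the following row).
    """
    problems = []
    running = 0
    for index, row in enumerate(rows):
        *blocks, counter = row.split()
        running += sum(len(block) for block in blocks)
        printed = int(counter)
        if printed != running:
            problems.append((index, running, printed))
            running = printed
    return problems


# The first two rows of SEQ ID NO:1 as they appear in the listing.
good_rows = [
    "TCTAAGGTTG AGATAAAATT TTTAAATGTA TGATTGAATT TTGAAAATCA TAAATATTTA 60",
    "AATATCTAAA GTTCAGATCA GAACATTGCG AAGCTACTTT CCCCAATCAA CAACACCCCT 120",
]
# The same rows with the last block of the second row removed on purpose.
damaged_rows = [
    good_rows[0],
    "AATATCTAAA GTTCAGATCA GAACATTGCG AAGCTACTTT CCCCAATCAA 120",
]

print(check_rows(good_rows))
print(check_rows(damaged_rows))
```

Rows that mix codon triplets with ten-base blocks can be checked the same way, since only letters are counted.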

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln
1 5 10 15

Thr Ala Val Leu Gln Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr
20 25 30

Leu Phe Met Gly Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu
35 40 45

Ala Leu Gly Tyr Val Asp Asp Gln Leu Phe Val Phe Tyr Asp His Glu
50 55 60

Ser Arg Arg Val Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser
65 70 75 80

Ser Gln Met Trp Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His
85 90 95

Met Phe Thr Val Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser
100 105 110

Lys Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu
115 120 125

Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp
130 135 140

His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro
145 150 155 160

Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala
165 170 175

Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln
180 185 190

Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro
195 200 205

Leu Val Lys Val Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg
210 215 220

Cys Arg Ala Leu Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu
225 230 235 240

Lys Asp Lys Gln Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val
245 250 255

Leu Pro Asn Gly Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val
260 265 270

Pro Pro Gly Glu Glu Gln Arg Tyr Thr Cys Gln Val Glu His Pro Gly
275 280 285

Leu Asp Gln Pro Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr
290 295 300

Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu
305 310 315 320

Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg
325 330 335

Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu
340 345



(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10825 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: join(361..436, 3762..4025, 4235..4510, 5606..5881,

6040..6153, 7107..7147)
(D) OTHER INFORMATION: /product= "Hereditary Hemochromatosis

(HH) protein containing the 24d1
mutation"

/note= "Hereditary Hemochromatosis (HH)
gene 24d1 allele"

(ix) FEATURE: 

(A) NAME /KEY: - 

(B) LOCATION: 140.. 7319 

(D) OTHER INFORMATION: /note= "start and stop positions for 

24d1 allele cDNA (SEQ ID NO:10)"

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 3852.. 3891 

(D) OTHER INFORMATION: /note= "start and stop positions for 

genomic sequence surrounding variant 
for 24d2(C) allele (SEQ ID NO: 41)" 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 5507..6023

(D) OTHER INFORMATION: /note= "start and stop positions for

genomic sequence surrounding variant
for 24d1(A) allele (SEQ ID NO:21)"

(ix) FEATURE: 

(A) NAME/KEY: allele 

(B) LOCATION: replace(5834, "a")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis

(HH)"

/label= 24d1
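
The replace(5834, "a") feature can be read as a single-base substitution at a 1-based coordinate. The sketch below is illustrative only and is not part of the listing: the helper apply_replace and the two-entry codon table are hypothetical, and it assumes, from the SEQ ID NO:1 row that follows counter 5832, that the codon TGC (Cys) occupies positions 5833..5835 in the wild-type sequence.

```python
def apply_replace(fragment, position, base, first_position=1):
    """Return `fragment` with the base at 1-based `position` replaced.

    `first_position` is the sequence coordinate of fragment[0].
    """
    index = position - first_position
    if not 0 <= index < len(fragment):
        raise IndexError("position lies outside the supplied fragment")
    return fragment[:index] + base.upper() + fragment[index + 1:]


# Only the two codons needed here, taken from the standard genetic code.
CODON_TABLE = {"TGC": "Cys", "TAC": "Tyr"}

wild_type_codon = "TGC"  # assumed to span 5833..5835 in SEQ ID NO:1
variant_codon = apply_replace(wild_type_codon, 5834, "a", first_position=5833)

print(wild_type_codon, CODON_TABLE[wild_type_codon])
print(variant_codon, CODON_TABLE[variant_codon])
```

Read this way, the substitution changes the codon from TGC to TAC, that is, from Cys to Tyr under the standard genetic code.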
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



TCTAAGGTTG AGATAAAATT TTTAAATGTA TGATTGAATT TTGAAAATCA TAAATATTTA 60

AATATCTAAA GTTCAGATCA GAACATTGCG AAGCTACTTT CCCCAATCAA CAACACCCCT 120

TCAGGATTTA AAAACCAAGG GGGACACTGG ATCACCTAGT GTTTCACAAG CAGGTACCTT 180

CTGCTGTAGG AGAGAGAGAA CTAAAGTTCT GAAAGACCTG TTGCTTTTCA CCAGGAAGTT 240

TTACTGGGCA TCTCCTGAGC CTAGGCAATA GCTGTAGGGT GACTTCTGGA GCCATCCCCG 300

TTTCCCCGCC CCCCAAAAGA AGCGGAGATT TAACGGGGAC GTGCGGCCAG AGCTGGGGAA 360

ATG GGC CCG CGA GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG 403
Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln
1 5 10 15

ACC GCG GTC CTG CAG GGG CGC TTG CTG C GTGAGTCCGA GGGCTGCGGG 456
Thr Ala Val Leu Gln Gly Arg Leu Leu
20 25

CGAACTAGGG GCGCGGCGGG GGTGGAAAAA TCGAAACTAG CTTTTTCTTT GCGCTTGGGA 516

GTTTGCTAAC TTTGGAGGAC CTGCTCAACC CTATCCGCAA GCCCCTCTCC CTACTTTCTG 576

CGTCCAGACC CCGTGAGGGA GTGCCTACCA CTGAACTGCA GATAGGGGTC CCTCGCCCCA 636

GGACCTGCCC CCTCCCCCGG CTGTCCCGGC TCTGCGGAGT GACTTTTGGA ACCGCCCACT 696

CCCTTCCCCC AACTAGAATG CTTTTAAATA AATCTCGTAG TTCCTCACTT GAGCTGAGCT 756

AAGCCTGGGG CTCCTTGAAC CTGGAACTCG GGTTTATTTC CAATGTCAGC TGTGCAGTTT 816

TTTCCCCAGT CATCTCCAAA CAGGAAGTTC TTCCCTGAGT GCTTGCCGAG AAGGCTGAGC 876

AAACCCACAG CAGGATCCGC ACGGGGTTTC CACCTCAGAA CGAATGCGTT GGGCGGTGGG 936

GGCGCGAAAG AGTGGCGTTG GGGATCTGAA TTCTTCACCA TTCCACCCAC TTTTGGTGAG 996

ACCTGGGGTG GAGGTCTCTA GGGTGGGAGG CTCCTGAGAG AGGCCTACCT CGGGCCTTTC 1056

CCCACTCTTG GCAATTGTTC TTTTGCCTGG AAAATTAAGT ATATGTTAGT TTTGAACGTT 1116

TGAACTGAAC AATTCTCTTT TCGGCTAGGC TTTATTGATT TGCAATGTGC TGTGTAATTA 1176

AGAGGCCTCT CTACAAAGTA CTGATAATGA ACATGTAAGC AATGCACTCA CTTCTAAGTT 1236

ACATTCATAT CTGATCTTAT TTGATTTTCA CTAGGCATAG GGAGGTAGGA GCTAATAATA 1296

CGTTTATTTT ACTAGAAGTT AACTGGAATT CAGATTATAT AACTCTTTTC AGGTTACAAA 1356

GAACATAAAT AATCTGGTTT TCTGATGTTA TTTCAAGTAC TACAGCTGCT TCTAATCTTA 1416

GTTGACAGTG ATTTTGCCCT GTAGTGTAGC ACAGTGTTCT GTGGGTCACA CGCCGGCCTC 1476

AGCACAGCAC TTTGAGTTTT GGTACTACGT GTATCCACAT TTTACACATG ACAAGAATGA 1536

GGCATGGCAC GGCCTGCTTC CTGGCAAATT TATTCAATGG TACACTGGGC TTTGGTGGCA 1596

GAGCTCATGT CTCCACTTCA TAGCTATGAT TCTTAAACAT CACACTGCAT TAGAGGTTGA 1656

ATAATAAAAT TTCATGTTGA GCAGAAATAT TCATTGTTTA CAAGTGTAAA TGAGTCCCAG 1716

CCATGTGTTG CACTGTTCAA GCCCCAAGGG AGAGAGCAGG GAAACAAGTC TTTACCCTTT 1776

GATATTTTGC ATTCTAGTGG GAGAGATGAC AATAAGCAAA TGAGCAGAAA GATATACAAC 1836

ATCAGGAAAT CATGGGTGTT GTGAGAAGCA GAGAAGTCAG GGCAAGTCAC TCTGGGGCTG 1896

ACACTTGAGC AGAGACATGA AGGAAATAAG AATGATATTG ACTGGGAGCA GTATTTCCCA 1956

GGCAAACTGA GTGGGCCTGG CAAGTTGGAT TAAAAAGCGG GTTTTCTCAG CACTACTCAT 2016

GTGTGTGTGT GTGGGGGGGG GGGGCGGCGT GGGGGTGGGA AGGGGGACTA CCATCTGCAT 2076

GTAGGATGTC TAGCAGTATC CTGTCCTCCC TACTCACTAG GTGCTAGGAG CACTCCCCCA 2136

GTCTTGACAA CCAAAAATGT CTCTAAACTT TGCCACATGT CACCTAGTAG ACAAACTCCT 2196

GGTTAAGAAG CTCGGGTTGA AAAAAATAAA CAAGTAGTGC TGGGGAGTAG AGGCCAAGAA 2256

GTAGGTAATG GGCTCAGAAG AGGAGCCACA AACAAGGTTG TGCAGGCGCC TGTAGGCTGT 2316

GGTGTGAATT CTAGCCAAGG AGTAACAGTG ATCTGTCACA GGCTTTTAAA AGATTGCTCT 2376

GGCTGCTATG TGGAAAGCAG AATGAAGGGA GCAACAGTAA AAGCAGGGAG CCCAGCCAGG 2436

AAGCTGTTAC ACAGTCCAGG CAAGAGGTAG TGGAGTGGGC TGGGTGGGAA CAGAAAAGGG 2496

AGTGACAAAC CATTGTCTCC TGAATATATT CTGAAGGAAG TTGCTGAAGG ATTCTATGTT 2556

GTGTGAGAGA AAGAGAAGAA TTGGCTGGGT GTAGTAGCTC ATGCCAAGGA GGAGGCCAAG 2616

GAGAGCAGAT TCCTGAGCTC AGGAGTTCAA GACCAGCCTG GGCAACACAG CAAAACCCCT 2676

TCTCTACAAA AAATACAAAA ATTAGCTGGG TGTGGTGGCA TGCACCTGTG ATCCTAGCTA 2736

CTCGGGAGGC TGAGGTGGAG GGTATTGCTT GAGCCCAGGA AGTTGAGGCT GCAGTGAGCC 2796

ATGACTGTGC CACTGTACTT CAGCCTAGGT GACAGAGCAA GACCCTGTCT CCCCTGACCC 2856

CCTGAAAAAG AGAAGAGTTA AAGTTGACTT TGTTCTTTAT TTTAATTTTA TTGGCCTGAG 2916

CAGTGGGGTA ATTGGCAATG CCATTTCTGA GATGGTGAAG GCAGAGGAAA GAGCAGTTTG 2976

GGGTAAATCA AGGATCTGCA TTTGGGACAT GTTAAGTTTG AGATTCCAGT CAGGCTTCCA 3036

AGTGGTGAGG CCACATAGGC AGTTCAGTGT AAGAATTCAG GACCAAGGCT GGGCACGGTG 3096

GCTCACTTCT GTAATCCCAG CACTTTGGTG GCTGAGGCAG GTAGATCATT TGAGGTCAGG 3156

AGTTTGAGAC AAGCTTGGCC AACATGGTGA AACCCCATGT CTACTAAAAA TACAAAAATT 3216

AGCCTGGTGT GGTGGCGCAC GCCTATAGTC CCAGGTTTTC AGGAGGCTTA GGTAGGAGAA 3276

TCCCTTGAAC CCAGGAGGTG CAGGTTGCAG TGAGCTGAGA TTGTGCCACT GCACTCCAGC 3336

CTGGGTGATA GAGTGAGACT CTGTCTCAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTGA 3396

AGGAATTATT CCTCAGGATT TGGGTCTAAT TTGCCCTGAG CACCAACTCC TGAGTTCAAC 3456

TACCATGGCT AGACACACCT TAACATTTTC TAGAATCCAC CAGCTTTAGT GGAGTCTGTC 3516

TAATCATGAG TATTGGAATA GGATCTGGGG GCAGTGAGGG GGTGGCAGCC ACGTGTGGCA 3576

GAGAAAAGCA CACAAGGAAA GAGCACCCAG GACTGTCATA TGGAAGAAAG ACAGGACTGC 3636

AACTCACCCT TCACAAAATG AGGACCAGAC ACAGCTGATG GTATGAGTTG ATGCAGGTGT 3696

GTGGAGCCTC AACATCCTGC TCCCCTCCTA CTACACATGG TTAAGGCCTG TTGCTCTGTC 3756

TCCAG GT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT GCC TCA GAG 3802
Arg Ser His Ser Leu His Tyr Leu Phe Met Gly Ala Ser Glu
30 35

CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC GTG GAT GAC 3850
Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr Val Asp Asp
40 45 50 55

CAG CTG TTC GTG TTC TAT GAT CAT GAG AGT CGC CGT GTG GAG CCC CGA 3898
Gln Leu Phe Val Phe Tyr Asp His Glu Ser Arg Arg Val Glu Pro Arg
60 65 70

ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG CTG CAG CTG 3946
Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp Leu Gln Leu
75 80 85

AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT GAC TTC TGG 3994
Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val Asp Phe Trp
90 95 100

ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG G GTATGTGGAG AGGGGGCCTC 4045
Thr Ile Met Glu Asn His Asn His Ser Lys
105 110

ACCTTCCTGA GGTTGTCAGA GCTTTTCATC TTTTCATGCA TCTTGAAGGA AACAGCTGGA 4105 

AGTCTGAGGT CTTGTGGGAG CAGGGAAGAG GGAAGGAATT TGCTTCCTGA GATCATTTGG 4165 

TCCTTGGGGA TGGTGGAAAT AGGGACCTAT TCCTTTGGTT GCAGTTAACA AGGCTGGGGA 4225

TTTTTCCAG AG TCC CAC ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG 4272
Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met
115 120 125

CAA GAA GAC AAC AGT ACC GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG 4320
Gln Glu Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly
130 135 140

CAG GAC CAC CTT GAA TTC TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA 4368
Gln Asp His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala
145 150 155

GAA CCC AGG GCC TGG CCC ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT 4416
Glu Pro Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile
160 165 170

CGG GCC AGG CAG AAC AGG GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG 4464
Arg Ala Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln
175 180 185 190

CTG CAG CAG TTG CTG GAG CTG GGG AGA GGT GTT TTG GAC CAA CAA G 4510
Leu Gln Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln
195 200 205

GTATGGTGGA AACACACTTC TGCCCCTATA CTCTAGTGGC AGAGTGGAGG AGGTTGCAGG 4570 

GCACGGAATC CCTGGTTGGA GTTTCAGAGG TGGCTGAGGC TGTGTGCCTC TCCAAATTCT 4630 

GGGAAGGGAC TTTCTCAATC CTAGAGTCTC TACCTTATAA TTGAGATGTA TGAGACAGCC 4690 

ACAAGTCATG GGTTTAATTT CTTTTCTCCA TGCATATGGC TCAAAGGGAA GTGTCTATGG 4750 

CCCTTGCTTT TTATTTAACC AATAATCTTT TGTATATTTA TACCTGTTAA AAATTCAGAA 4810 

ATGTCAAGGC CGGGCACGGT GGCTCACCCC TGTAATCCCA GCACTTTGGG AGGCCGAGGC 4870 

GGGTGGTCAC AAGGTCAGGA GTTTGAGACC AGCCTGACCA ACATGGTGAA ACCCGTCTCT 4930 

AAAAAAATAC AAAAATTAGC TGGTCACAGT CATGCGCACC TGTAGTCCCA GCTAATTGGA 4990 

AGGCTGAGGC AGGAGCATCG CTTGAACCTG GGAAGCGGAA GTTGCACTGA GCCAAGATCG 5050 

CGCCACTGCA CTCCAGCCTA GGCAGCAGAG TGAGACTCCA TCTTAAAAAA AAAAAAAAAA 5110 

AAAAAAAGAG AATTCAGAGA TCTCAGCTAT CATATGAATA CCAGGACAAA ATATCAAGTG 5170 

AGGCCACTTA TCAGAGTAGA AGAATCCTTT AGGTTAAAAG TTTCTTTCAT AGAACATAGC 5230 

AATAATCACT GAAGCTACCT ATCTTACAAG TCCGCTTCTT ATAACAATGC CTCCTAGGTT 5290 

GACCCAGGTG AAACTGACCA TCTGTATTCA ATCATTTTCA ATGCACATAA AGGGCAATTT 5350 

TATCTATCAG AACAAAGAAC ATGGGTAACA GATATGTATA TTTACATGTG AGGAGAACAA 5410 

GCTGATCTGA CTGCTCTCCA AGTGACACTG TGTTAGAGTC CAATCTTAGG ACACAAAATG 5470 

GTGTCTCTCC TGTAGCTTGT TTTTTTCTGA AAAGGGTATT TCCTTCCTCC AACCTATAGA 5530 

AGGAAGTGAA AGTTCCAGTC TTCCTGGCAA GGGTAAACAG ATCCCCTCTC CTCATCCTTC 5590 

CTCTTTCCTG TCAAG TG CCT CCT TTG GTG AAG GTG ACA CAT CAT GTG ACC 5640 
Val Pro Pro Leu Val Lys Val Thr His His Val Thr 
210 215 

TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG AAC TAC TAC CCC CAG 5688 
Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu Asn Tyr Tyr Pro Gln 
220 225 230 

AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG CCA ATG GAT GCC AAG 5736 
Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln Pro Met Asp Ala Lys 
235 240 245 

GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG GAT GGG ACC TAC CAG 5784 
Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly Asp Gly Thr Tyr Gln 
250 255 260 265 

GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA GAG CAG AGA TAT ACG 5832 
Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu Glu Gln Arg Tyr Thr 
270 275 280 

TAC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC CTC ATT GTG ATC TGG G 5881 
Tyr Gln Val Glu His Pro Gly Leu Asp Gln Pro Leu Ile Val Ile Trp 
285 290 295 

GTATGTGACT GATGAGAGCC AGGAGCTGAG AAAATCTATT GGGGGTTGAG AGGAGTGCCT 5941 

GAGGAGGTAA TTATGGCAGT GAGATGAGGA TCTGCTCTTT GTTAGGGGGT GGGCTGAGGG 6001 

TGGCAATCAA AGGCTTTAAC TTGCTTTTTC TGTTTTAG AG CCC TCA CCG TCT 6053 
Glu Pro Ser Pro Ser 
300 

GGC ACC CTA GTC ATT GGA GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC 6101 
Gly Thr Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val 
305 310 315 

ATC TTG TTC ATT GGA ATT TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT 6149 
Ile Leu Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly 
320 325 330 

TCA A GTGAGTAGGA ACAAGGGGGA AGTCTCTTAG TACCTCTGCC CCAGGGCACA 6203 
Ser 
335 

GTGGGAAGAG GGGCAGAGGG GATCTGGCAT CCATGGGAAG CATTTTTCTC ATTTATATTC 6263 

TTTGGGGACA CCAGCAGCTC CCTGGGAGAC AGAAAATAAT GGTTCTCCCC AGAATGAAAG 6323 

TCTCTAATTC AACAAACATC TTCAGAGCAC CTACTATTTT GCAAGAGCTG TTTAAGGTAG 6383 

TACAGGGGCT TTGAGGTTGA GAAGTCACTG TGGCTATTCT CAGAACCCAA ATCTGGTAGG 6443 

GAATGAAATT GATAGCAAGT AAATGTAGTT AAAGAAGACC CCATGAGGTC CTAAAGCAGG 6503 

CAGGAAGCAA ATGCTTAGGG TGTCAAAGGA AAGAATGATC ACATTCAGCT GGGGATCAAG 6563 

ATAGCCTTCT GGATCTTGAA GGAGAAGCTG GATTCCATTA GGTGAGGTTG AAGATGATGG 6623 

GAGGTCTACA CAGACGGAGC AACCATGCCA AGTAGGAGAG TATAAGGCAT ACTGGGAGAT 6683 

TAGAAATAAT TACTGTACCT TAACCCTGAG TTTGCGTAGC TATCACTCAC CAATTATGCA 6743 

TTTCTACCCC CTGAACATCT GTGGTGTAGG GAAAAGAGAA TCAGAAAGAA GCCAGCTCAT 6803 

ACAGAGTCCA AGGGTCTTTT GGGATATTGG GTTATGATCA CTGGGGTGTC ATTGAAGGAT 6863 

CCTAAGAAAG GAGGACCACG ATCTCCCTTA TATGGTGAAT GTGTTGTTAA GAAGTTAGAT 6923 

GAGAGGTGAG GAGACCAGTT AGAAAGCCAA TAAGCATTTC CAGATGAGAG ATAATGGTTC 6983 

TTGAAATCCA ATAGTGCCCA GGTCTAAATT GAGATGGGTG AATGAGGAAA ATAAGGAAGA 7043 

GAGAAGAGGC AAGATGGTGC CTAGGTTTGT GATGCCTCTT TCCTGGGTCT CTTGTCTCCA 7103 

CAG GA GGA GCC ATG GGG CAC TAC GTC TTA GCT GAA CGT GAG 7144 
Arg Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu 
340 345 

TGACACGCAG CCTGCAGACT CACTGTGGGA AGGAGACAAA ACTAGAGACT CAAAGAGGGA 7204 

GTGCATTTAT GAGCTCTTCA TGTTTCAGGA GAGAGTTGAA CCTAAACATA GAAATTGCCT 7264 

GACGAACTCC TTGATTTTAG CCTTCTCTGT TCATTTCCTC AAAAAGATTT CCCCATTTAG 7324 

GTTTCTGAGT TCCTGCATGC CGGTGATCCC TAGCTGTGAC CTCTCCCCTG GAACTGTCTC 7384 

TCATGAACCT CAAGCTGCAT CTAGAGGCTT CCTTCATTTC CTCCGTCACC TCAGAGACAT 7444 

ACACCTATGT CATTTCATTT CCTATTTTTG GAAGAGGACT CCTTAAATTT GGGGGACTTA 7504 

CATGATTCAT TTTAACATCT GAGAAAAGCT TTGAACCCTG GGACGTGGCT AGTCATAACC 7564 

TTACCAGATT TTTACACATG TATCTATGCA TTTTCTGGAC CCGTTCAACT TTTCCTTTGA 7624 

ATCCTCTCTC TGTGTTACCC AGTAACTCAT CTGTCACCAA GCCTTGGGGA TTCTTCCATC 7684 

TGATTGTGAT GTGAGTTGCA CAGCTATGAA GGCTGTACAC TGCACGAATG GAAGAGGCAC 7744 

CTGTCCCAGA AAAAGCATCA TGGCTATCTG TGGGTAGTAT GATGGGTGTT TTTAGCAGGT 7804 

AGGAGGCAAA TATCTTGAAA GGGGTTGTGA AGAGGTGTTT TTTCTAATTG GCATGAAGGT 7864 

GTCATACAGA TTTGCAAAGT TTAATGGTGC CTTCATTTGG GATGCTACTC TAGTATTCCA 7924 

GACCTGAAGA ATCACAATAA TTTTCTACCT GGTCTCTCCT TGTTCTGATA ATGAAAATTA 7984 

TGATAAGGAT GATAAAAGCA CTTACTTCGT GTCCGACTCT TCTGAGCACC TACTTACATG 8044 

CATTACTGCA TGCACTTCTT ACAATAATTC TATGAGATAG GTACTATTAT CCCCATTTCT 8104 

TTTTTAAATG AAGAAAGTGA AGTAGGCCGG GCACGGTGGC TCACGCCTGT AATCCCAGCA 8164 

CTTTGGGAGG CCAAAGCGGG TGGATCACGA GGTCAGGAGA TCGAGACCAT CCTGGCTAAC 8224 

ATGGTGAAAC CCCATCTCTA ATAAAAATAC AAAAAATTAG CTGGGCGTGG TGGCAGACGC 8284 

CTGTAGTCCC AGCTACTCGG AAGGCTGAGG CAGGAGAATG GCATGAACCC AGGAGGCAGA 8344 

GCTTGCAGTG AGCCGAGTTT GCGCCACTGC ACTCCAGCCT AGGTGACAGA GTGAGACTCC 8404 

ATCTCAAAAA AATAAAAATA AAAATAAAAA AATGAAAAAA AAAAGAAAGT GAAGTATAGA 8464 

GTATCTCATA GTTTGTCAGT GATAGAAACA GGTTTCAAAC TCAGTCAATC TGACCGTTTG 8524 

ATACATCTCA GACACCACTA CATTCAGTAG TTTAGATGCC TAGAATAAAT AGAGAAGGAA 8584 

GGAGATGGCT CTTCTCTTGT CTCATTGTGT TTCTTCTGAG TGAGCTTGAA TCACATGAAG 8644 

GGGAACAGCA GAAAACAACC AACTGATCCT CAGCTGTCAT GTTTCCTTTA AAAGTCCCTG 8704 

AAGGAAGGTC CTGGAATGTG ACTCCCTTGC TCCTCTGTTG CTCTCTTTGG CATTCATTTC 8764 

TTTGGACCCT ACGCAAGGAC TGTAATTGGT GGGGACAGCT AGTGGCCCTG CTGGGCTTCA 8824 

CACACGGTGT CCTCCCTAGG CCAGTGCCTC TGGAGTCAGA ACTCTGGTGG TATTTCCCTC 8884 

AATGAAGTGG AGTAAGCTCT CTCATTTTGA GATGGTATAA TGGAAGCCAC CAAGTGGCTT 8944 

AGAGGATGCC CAGGTCCTTC CATGGAGCCA CTGGGGTTCC GGTGCACATT AAAAAAAAAA 9004 

TCTAACCAGG ACATTCAGGA ATTGCTAGAT TCTGGGAAAT CAGTTCACCA TGTTCAAAAG 9064 

AGTCTTTTTT TTTTTTTTGA GACTCTATTG CCCAGGCTGG AGTGCAATGG CATGATCTCG 9124 

GCTCACTGTA ACCTCTGCCT CCCAGGTTCA AGCGATTCTC CTGTCTCAGC CTCCCAAGTA 9184 

GCTGGGATTA CAGGCGTGCA CCACCATGCC CGGCTAATTT TTGTATTTTT AGTAGAGACA 9244 

GGGTTTCACC ATGTTGGCCA GGCTGGTCTC GAACTCTCCT GACCTCGTGA TCCGCCTGCC 9304 

TCGGCCTCCC AAAGTGCTGA GATTACAGGT GTGAGCCACC CTGCCCAGCC GTCAAAAGAG 9364 

TCTTAATATA TATATCCAGA TGGCATGTGT TTACTTTATG TTACTACATG CACTTGGCTG 9424 

CATAAATGTG GTACAAGCAT TCTGTCTTGA AGGGCAGGTG CTTCAGGATA CCATATACAG 9484 

CTCAGAAGTT TCTTCTTTAG GCATTAAATT TTAGCAAAGA TATCTCATCT CTTCTTTTAA 9544 

ACCATTTTCT TTTTTTGTGG TTAGAAAAGT TATGTAGAAA AAAGTAAATG TGATTTACGC 9604 

TCATTGTAGA AAAGCTATAA AATGAATACA ATTAAAGCTG TTATTTAATT AGCCAGTGAA 9664 

AAACTATTAA CAACTTGTCT ATTACCTGTT AGTATTATTG TTGCATTAAA AATGCATATA 9724 

CTTTAATAAA TGTATATTGT ATTGTATACT GCATGATTTT ATTGAAGTTC TTGTTCATCT 9784 

TGTGTATATA CTTAATCGCT TTGTCATTTT GGAGACATTT ATTTTGCTTC TAATTTCTTT 9844 

ACATTTTGTC TTACGGAATA TTTTCATTCA ACTGTGGTAG CCGAATTAAT CGTGTTTCTT 9904 

CACTCTAGGG ACATTGTCGT CTAAGTTGTA AGACATTGGT TATTTTACCA GCAAACCATT 9964 

CTGAAAGCAT ATGACAAATT ATTTCTCTCT TAATATCTTA CTATACTGAA AGCAGACTGC 10024 

TATAAGGCTT CACTTACTCT TCTACCTCAT AAGGAATATG TTACAATTAA TTTATTAGGT 10084 

AAGCATTTGT TTTATATTGG TTTTATTTCA CCTGGGCTGA GATTTCAAGA AACACCCCAG 10144 

TCTTCACAGT AACACATTTC ACTAACACAT TTACTAAACA TCAGCAACTG TGGCCTGTTA 10204 

ATTTTTTTAA TAGAAATTTT AAGTCCTCAT TTTCTTTCGG TGTTTTTTAA GCTTAATTTT 10264 

TCTGGCTTTA TTCATAAATT CTTAAGGTCA ACTACATTTG AAAAATCAAA GACCTGCATT 10324 

TTAAATTCTT ATTCACCTCT GGCAAAACCA TTCACAAACC ATGGTAGTAA AGAGAAGGGT 10384 

GACACCTGGT GGCCATAGGT AAATGTACCA CGGTGGTCCG GTGACCAGAG ATGCAGCGCT 10444 

GAGGGTTTTC CTGAAGGTAA AGGAATAAAG AATGGGTGGA GGGGCGTGCA CTGGAAATCA 10504 

CTTGTAGAGA AAAGCCCCTG AAAATTTGAG AAAACAAACA AGAAACTACT TACCAGCTAT 10564 

TTGAATTGCT GGAATCACAG GCCATTGCTG AGCTGCCTGA ACTGGGAACA CAACAGAAGG 10624 

AAAACAAACC ACTCTGATAA TCATTGAGTC AAGTACAGCA GGTGATTGAG GACTGCTGAG 10684 

AGGTACAGGC CAAAATTCTT ATGTTGTATT ATAATAATGT CATCTTATAA TACTGTCAGT 10744 

ATTTTATAAA ACATTCTTCA CAAACTCACA CACATTTAAA AACAAAACAC TGTCTCTAAA 10804 

ATCCCCAAAT TTTTCATAAA C 10825 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln 
1 5 10 15 

Thr Ala Val Leu Gln Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr 
20 25 30 

Leu Phe Met Gly Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu 
35 40 45 

Ala Leu Gly Tyr Val Asp Asp Gln Leu Phe Val Phe Tyr Asp His Glu 
50 55 60 

Ser Arg Arg Val Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser 
65 70 75 80 

Ser Gln Met Trp Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His 
85 90 95 

Met Phe Thr Val Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser 
100 105 110 

Lys Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu 
115 120 125 

Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp 
130 135 140 

His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro 
145 150 155 160 

Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala 
165 170 175 

Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln 
180 185 190 

Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro 
195 200 205 

Leu Val Lys Val Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg 
210 215 220 

Cys Arg Ala Leu Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu 
225 230 235 240 

Lys Asp Lys Gln Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val 
245 250 255 

Leu Pro Asn Gly Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val 
260 265 270 

Pro Pro Gly Glu Glu Gln Arg Tyr Thr Tyr Gln Val Glu His Pro Gly 
275 280 285 

Leu Asp Gln Pro Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr 
290 295 300 

Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu 
305 310 315 320 

Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg 
325 330 335 

Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu 
340 345 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10825 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: join(361..436, 3762..4025, 4235..4510, 5606..5881, 
6040..6153, 7107..7147) 

(D) OTHER INFORMATION: /product= "Hereditary Hemochromatosis 
(HH) protein containing the 24d2 
mutation" 

/note= "Hereditary Hemochromatosis (HH) 
gene 24d2 allele" 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 140..7319 

(D) OTHER INFORMATION: /note= "start and stop positions for 
24d2 allele cDNA (SEQ ID NO:11)" 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 3852..3891 

(D) OTHER INFORMATION: /note= "start and stop positions for 
genomic sequence surrounding variant 
for 24d2(G) allele (SEQ ID NO:42)" 

(ix) FEATURE: 

(A) NAME/KEY: - 

(B) LOCATION: 5507..6023 

(D) OTHER INFORMATION: /note= "start and stop positions for 
genomic sequence surrounding variant 
for 24d1(G) allele (SEQ ID NO:20)" 

(ix) FEATURE: 

(A) NAME/KEY: allele 

(B) LOCATION: replace(3872, "g") 

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis 
(HH)" 

/label= 24d2 
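
As a hedged illustration of the allele feature above, the sketch below applies a single-base replacement at position 3872 to the nine bases spanning positions 3869 to 3877 (GAT CAT GAG in the copy of this region that encodes His at residue 63). The helper names `apply_replace` and `translate`, and the three-entry codon table, are assumptions made for this example only and are not part of the listing.

```python
# Tiny codon table: only the codons needed for this illustration.
CODON_TABLE = {"GAT": "Asp", "CAT": "His", "GAG": "Glu"}


def apply_replace(fragment, fragment_start, position, base):
    """Replace the base at 1-based sequence coordinate `position`.

    `fragment_start` is the sequence coordinate of fragment[0].
    """
    index = position - fragment_start
    if not 0 <= index < len(fragment):
        raise ValueError("position lies outside the fragment")
    return fragment[:index] + base.upper() + fragment[index + 1:]


def translate(fragment):
    """Translate an in-frame fragment codon by codon (three-letter codes)."""
    return [CODON_TABLE[fragment[i:i + 3]] for i in range(0, len(fragment), 3)]


reference = "GATCATGAG"  # positions 3869..3877, His-encoding copy
variant = apply_replace(reference, 3869, 3872, "g")

print(reference, translate(reference))
print(variant, translate(variant))
```

Under these assumptions the middle codon changes from CAT to GAT, which matches the Asp shown at residue 63 in the translation of SEQ ID NO: 5.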

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TCTAAGGTTG AGATAAAATT TTTAAATGTA TGATTGAATT TTGAAAATCA TAAATATTTA 60 

AATATCTAAA GTTCAGATCA GAACATTGCG AAGCTACTTT CCCCAATCAA CAACACCCCT 120 

TCAGGATTTA AAAACCAAGG GGGACACTGG ATCACCTAGT GTTTCACAAG CAGGTACCTT 180 

CTGCTGTAGG AGAGAGAGAA CTAAAGTTCT GAAAGACCTG TTGCTTTTCA CCAGGAAGTT 240 

TTACTGGGCA TCTCCTGAGC CTAGGCAATA GCTGTAGGGT GACTTCTGGA GCCATCCCCG 300 

TTTCCCCGCC CCCCAAAAGA AGCGGAGATT TAACGGGGAC GTGCGGCCAG AGCTGGGGAA 360 

ATG GGC CCG CGA GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG 408 
Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln 
1 5 10 15 

ACC GCG GTC CTG CAG GGG CGC TTG CTG C GTGAGTCCGA GGGCTGCGGG 456 
Thr Ala Val Leu Gln Gly Arg Leu Leu 
20 25 








CGAACTAGGG GCGCGGCGGG GGTGGAAAAA TCGAAACTAG CTTTTTCTTT GCGCTTGGGA 516 

GTTTGCTAAC TTTGGAGGAC CTGCTCAACC CTATCCGCAA GCCCCTCTCC CTACTTTCTG 576 

CGTCCAGACC CCGTGAGGGA GTGCCTACCA CTGAACTGCA GATAGGGGTC CCTCGCCCCA 636 

GGACCTGCCC CCTCCCCCGG CTGTCCCGGC TCTGCGGAGT GACTTTTGGA ACCGCCCACT 696 

CCCTTCCCCC AACTAGAATG CTTTTAAATA AATCTCGTAG TTCCTCACTT GAGCTGAGCT 756 

AAGCCTGGGG CTCCTTGAAC CTGGAACTCG GGTTTATTTC CAATGTCAGC TGTGCAGTTT 816 

TTTCCCCAGT CATCTCCAAA CAGGAAGTTC TTCCCTGAGT GCTTGCCGAG AAGGCTGAGC 876 

AAACCCACAG CAGGATCCGC ACGGGGTTTC CACCTCAGAA CGAATGCGTT GGGCGGTGGG 936 

GGCGCGAAAG AGTGGCGTTG GGGATCTGAA TTCTTCACCA TTCCACCCAC TTTTGGTGAG 996 

ACCTGGGGTG GAGGTCTCTA GGGTGGGAGG CTCCTGAGAG AGGCCTACCT CGGGCCTTTC 1056 

CCCACTCTTG GCAATTGTTC TTTTGCCTGG AAAATTAAGT ATATGTTAGT TTTGAACGTT 1116 

TGAACTGAAC AATTCTCTTT TCGGCTAGGC TTTATTGATT TGCAATGTGC TGTGTAATTA 1176 

AGAGGCCTCT CTACAAAGTA CTGATAATGA ACATGTAAGC AATGCACTCA CTTCTAAGTT 1236 

ACATTCATAT CTGATCTTAT TTGATTTTCA CTAGGCATAG GGAGGTAGGA GCTAATAATA 1296 

CGTTTATTTT ACTAGAAGTT AACTGGAATT CAGATTATAT AACTCTTTTC AGGTTACAAA 1356 

GAACATAAAT AATCTGGTTT TCTGATGTTA TTTCAAGTAC TACAGCTGCT TCTAATCTTA 1416 

GTTGACAGTG ATTTTGCCCT GTAGTGTAGC ACAGTGTTCT GTGGGTCACA CGCCGGCCTC 1476 

AGCACAGCAC TTTGAGTTTT GGTACTACGT GTATCCACAT TTTACACATG ACAAGAATGA 1536 

GGCATGGCAC GGCCTGCTTC CTGGCAAATT TATTCAATGG TACACTGGGC TTTGGTGGCA 1596 

GAGCTCATGT CTCCACTTCA TAGCTATGAT TCTTAAACAT CACACTGCAT TAGAGGTTGA 1656 

ATAATAAAAT TTCATGTTGA GCAGAAATAT TCATTGTTTA CAAGTGTAAA TGAGTCCCAG 1716 

CCATGTGTTG CACTGTTCAA GCCCCAAGGG AGAGAGCAGG GAAACAAGTC TTTACCCTTT 1776 

GATATTTTGC ATTCTAGTGG GAGAGATGAC AATAAGCAAA TGAGCAGAAA GATATACAAC 1836 

ATCAGGAAAT CATGGGTGTT GTGAGAAGCA GAGAAGTCAG GGCAAGTCAC TCTGGGGCTG 1896 

ACACTTGAGC AGAGACATGA AGGAAATAAG AATGATATTG ACTGGGAGCA GTATTTCCCA 1956 

GGCAAACTGA GTGGGCCTGG CAAGTTGGAT TAAAAAGCGG GTTTTCTCAG CACTACTCAT 2016 

GTGTGTGTGT GTGGGGGGGG GGGGCGGCGT GGGGGTGGGA AGGGGGACTA CCATCTGCAT 2076 

GTAGGATGTC TAGCAGTATC CTGTCCTCCC TACTCACTAG GTGCTAGGAG CACTCCCCCA 2136 

GTCTTGACAA CCAAAAATGT CTCTAAACTT TGCCACATGT CACCTAGTAG ACAAACTCCT 2196 

GGTTAAGAAG CTCGGGTTGA AAAAAATAAA CAAGTAGTGC TGGGGAGTAG AGGCCAAGAA 2256 

GTAGGTAATG GGCTCAGAAG AGGAGCCACA AACAAGGTTG TGCAGGCGCC TGTAGGCTGT 2316 

GGTGTGAATT CTAGCCAAGG AGTAACAGTG ATCTGTCACA GGCTTTTAAA AGATTGCTCT 2376 

GGCTGCTATG TGGAAAGCAG AATGAAGGGA GCAACAGTAA AAGCAGGGAG CCCAGCCAGG 2436 

AAGCTGTTAC ACAGTCCAGG CAAGAGGTAG TGGAGTGGGC TGGGTGGGAA CAGAAAAGGG 2496 

AGTGACAAAC CATTGTCTCC TGAATATATT CTGAAGGAAG TTGCTGAAGG ATTCTATGTT 2556 

GTGTGAGAGA AAGAGAAGAA TTGGCTGGGT GTAGTAGCTC ATGCCAAGGA GGAGGCCAAG 2616 

GAGAGCAGAT TCCTGAGCTC AGGAGTTCAA GACCAGCCTG GGCAACACAG CAAAACCCCT 2676 

TCTCTACAAA AAATACAAAA ATTAGCTGGG TGTGGTGGCA TGCACCTGTG ATCCTAGCTA 2736 

CTCGGGAGGC TGAGGTGGAG GGTATTGCTT GAGCCCAGGA AGTTGAGGCT GCAGTGAGCC 2796 

ATGACTGTGC CACTGTACTT CAGCCTAGGT GACAGAGCAA GACCCTGTCT CCCCTGACCC 2856 

CCTGAAAAAG AGAAGAGTTA AAGTTGACTT TGTTCTTTAT TTTAATTTTA TTGGCCTGAG 2916 

CAGTGGGGTA ATTGGCAATG CCATTTCTGA GATGGTGAAG GCAGAGGAAA GAGCAGTTTG 2976 

GGGTAAATCA 




T T T Cits SAC AT 


CjTTAAGTTTS 


AG AT T CC AvsT 


l^Atjtj^ X i vv« 


jujd 


AGTGGTGAGG CCACATAGGC AGTTCAGTGT AAGAATTCAG GACCAAGGCT GGGCACGGTG 3096 

GCTCACTTCT GTAATCCCAG CACTTTGGTG GCTGAGGCAG GTAGATCATT TGAGGTCAGG 3156 

AGTTTGAGAC AAGCTTGGCC AACATGGTGA AACCCCATGT CTACTAAAAA TACAAAAATT 3216 

AGCCTGGTGT GGTGGCGCAC GCCTATAGTC CCAGGTTTTC AGGAGGCTTA GGTAGGAGAA 3276 

TCCCTTGAAC CCAGGAGGTG CAGGTTGCAG TGAGCTGAGA TTGTGCCACT GCACTCCAGC 3336 

CTGGGTGATA GAGTGAGACT CTGTCTCAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTGA 3396 

AGGAATTATT CCTCAGGATT TGGGTCTAAT TTGCCCTGAG CACCAACTCC TGAGTTCAAC 3456 

TACCATGGCT AGACACACCT TAACATTTTC TAGAATCCAC CAGCTTTAGT GGAGTCTGTC 3516 

TAATCATGAG TATTGGAATA GGATCTGGGG GCAGTGAGGG GGTGGCAGCC ACGTGTGGCA 3576 

GAGAAAAGCA CACAAGGAAA GAGCACCCAG GACTGTCATA TGGAAGAAAG ACAGGACTGC 3636 

AACTCACCCT TCACAAAATG AGGACCAGAC ACAGCTGATG GTATGAGTTG ATGCAGGTGT 3696 

GTGGAGCCTC AACATCCTGC TCCCCTCCTA CTACACATGG TTAAGGCCTG TTGCTCTGTC 3756 


TCCAG GT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT GCC TCA GAG 3802 
Arg Ser His Ser Leu His Tyr Leu Phe Met Gly Ala Ser Glu 
30 35 

CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC GTG GAT GAC 3850 
Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr Val Asp Asp 
40 45 50 55 

CAG CTG TTC GTG TTC TAT GAT GAT GAG AGT CGC CGT GTG GAG CCC CGA 3898 
Gln Leu Phe Val Phe Tyr Asp Asp Glu Ser Arg Arg Val Glu Pro Arg 
60 65 70 

ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG CTG CAG CTG 3946 
Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp Leu Gln Leu 
75 80 85 

AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT GAC TTC TGG 3994 
Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val Asp Phe Trp 
90 95 100 

ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG G GTATGTGGAG AGGGGGCCTC 4045 
Thr Ile Met Glu Asn His Asn His Ser Lys 
105 110 

ACCTTCCTGA GGTTGTCAGA GCTTTTCATC TTTTCATGCA TCTTGAAGGA AACAGCTGGA 4105 

AGTCTGAGGT CTTGTGGGAG CAGGGAAGAG GGAAGGAATT TGCTTCCTGA GATCATTTGG 4165 

TCCTTGGGGA TGGTGGAAAT AGGGACCTAT TCCTTTGGTT GCAGTTAACA AGGCTGGGGA 4225 

TTTTTCCAG AG TCC CAC ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG 4272 
Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met 
115 120 125 

CAA GAA GAC AAC AGT ACC GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG 4320 
Gln Glu Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly 
130 135 140 

CAG GAC CAC CTT GAA TTC TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA 4368 
Gln Asp His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala 
145 150 155 

GAA CCC AGG GCC TGG CCC ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT 4416 
Glu Pro Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile 
160 165 170 

CGG GCC AGG CAG AAC AGG GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG 4464 
Arg Ala Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln 
175 180 185 190 

CTG CAG CAG TTG CTG GAG CTG GGG AGA GGT GTT TTG GAC CAA CAA G 4510 
Leu Gln Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln 
195 200 205 




GTATGGTGGA AACACACTTC TGCCCCTATA CTCTAGTGGC AGAGTGGAGG AGGTTGCAGG 4570 

GCACGGAATC CCTGGTTGGA GTTTCAGAGG TGGCTGAGGC TGTGTGCCTC TCCAAATTCT 4630 

CGGAAGGGAC TTTCTCAATC CTAGAGTCTC TACCTTATAA TTGAGATGTA TGAGACAGCC 4690 

ACAAGTCATG GGTTTAATTT CTTTTCTCCA TGCATATGGC TCAAAGGGAA GTGTCTATGG 4750 

CCCTTGCTTT TTATTTAACC AATAATCTTT TGTATATTTA TACCTGTTAA AAATTCAGAA 4810 

ATGTCAAGGC CGGGCACGGT GGCTCACCCC TGTAATCCCA GCACTTTGGG AGGCCGAGGC 4870 

GGGTGGTCAC AAGGTCAGGA GTTTGAGACC AGCCTGACCA ACATGGTGAA ACCCGTCTCT 4930 

AAAAAAATAC AAAAATTAGC TGGTCACAGT CATGCGCACC TGTAGTCCCA GCTAATTGGA 4990 

AGGCTGAGGC AGGAGCATCG CTTGAACCTG GGAAGCGGAA GTTGCACTGA GCCAAGATCG 5050 

CGCCACTGCA CTCCAGCCTA GGCAGCAGAG TGAGACTCCA TCTTAAAAAA AAAAAAAAAA 5110 

AAAAAAAGAG AATTCAGAGA TCTCAGCTAT CATATGAATA CCAGGACAAA ATATCAAGTG 5170 

AGGCCACTTA TCAGAGTAGA AGAATCCTTT AGGTTAAAAG TTTCTTTCAT AGAACATAGC 5230 

AATAATCACT GAAGCTACCT ATCTTACAAG TCCGCTTCTT ATAACAATGC CTCCTAGGTT 5290 

GACCCAGGTG AAACTGACCA TCTGTATTCA ATCATTTTCA ATGCACATAA AGGGCAATTT 5350 

TATCTATCAG AACAAAGAAC ATGGGTAACA GATATGTATA TTTACATGTG AGGAGAACAA 5410 

GCTGATCTGA CTGCTCTCCA AGTGACACTG TGTTAGAGTC CAATCTTAGG ACACAAAATG 5470 

GTGTCTCTCC TGTAGCTTGT TTTTTTCTGA AAAGGGTATT TCCTTCCTCC AACCTATAGA 5530 

AGGAAGTGAA AGTTCCAGTC TTCCTGGCAA GGGTAAACAG ATCCCCTCTC CTCATCCTTC 5590 


CTCTTTCCTG TCAAG TG CCT CCT TTG GTG AAG GTG ACA CAT CAT GTG ACC 5640 
Val Pro Pro Leu Val Lys Val Thr His His Val Thr 
210 215 

TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG AAC TAC TAC CCC CAG 5688 
Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu Asn Tyr Tyr Pro Gln 
220 225 230 



AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG CCA ATG GAT GCC AAG 5736 
Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln Pro Met Asp Ala Lys 
235 240 245 

GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG GAT GGG ACC TAC CAG 5784 
Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly Asp Gly Thr Tyr Gln 
250 255 260 265 

GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA GAG CAG AGA TAT ACG 5832 
Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu Glu Gln Arg Tyr Thr 
270 275 280 

TGC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC CTC ATT GTG ATC TGG G 5881 
Cys Gln Val Glu His Pro Gly Leu Asp Gln Pro Leu Ile Val Ile Trp 
285 290 295 

GTATGTGACT GATGAGAGCC AGGAGCTGAG AAAATCTATT GGGGGTTGAG AGGAGTGCCT 5941 

GAGGAGGTAA TTATGGCAGT GAGATGAGGA TCTGCTCTTT GTTAGGGGGT GGGCTGAGGG 6001 

TGGCAATCAA AGGCTTTAAC TTGCTTTTTC TGTTTTAG AG CCC TCA CCG TCT 6053 

Glu Pro Ser Pro Ser 
300 

GGC ACC CTA GTC ATT GGA GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC 6101 
Gly Thr Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val 
305 310 315 

ATC TTG TTC ATT GGA ATT TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT 6149 
Ile Leu Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly 
320 325 330 

TCA A GTGAGTAGGA ACAAGGGGGA AGTCTCTTAG TACCTCTGCC CCAGGGCACA 6203 
Ser 
335 


GTGGGAAGAG GGGCAGAGGG GATCTGGCAT CCATGGGAAG CATTTTTCTC ATTTATATTC 6263 

TTTGGGGACA CCAGCAGCTC CCTGGGAGAC AGAAAATAAT GGTTCTCCCC AGAATGAAAG 6323 

TCTCTAATTC AACAAACATC TTCAGAGCAC CTACTATTTT GCAAGAGCTG TTTAAGGTAG 6383 

TACAGGGGCT TTGAGGTTGA GAAGTCACTG TGGCTATTCT CAGAACCCAA ATCTGGTAGG 6443 

GAATGAAATT GATAGCAAGT AAATGTAGTT AAAGAAGACC CCATGAGGTC CTAAAGCAGG 6503 

CAGGAAGCAA ATGCTTAGGG TGTCAAAGGA AAGAATGATC ACATTCAGCT GGGGATCAAG 6563 

ATAGCCTTCT GGATCTTGAA GGAGAAGCTG GATTCCATTA GGTGAGGTTG AAGATGATGG 6623 

GAGGTCTACA CAGACGGAGC AACCATGCCA AGTAGGAGAG TATAAGGCAT ACTGGGAGAT 6683 

TAGAAATAAT TACTGTACCT TAACCCTGAG TTTGCGTAGC TATCACTCAC CAATTATGCA 6743 

TTTCTACCCC CTGAACATCT GTGGTGTAGG GAAAAGAGAA TCAGAAAGAA GCCAGCTCAT 6803 

ACAGAGTCCA AGGGTCTTTT GGGATATTGG GTTATGATCA CTGGGGTGTC ATTGAAGGAT 6863 

CCTAAGAAAG GAGGACCACG ATCTCCCTTA TATGGTGAAT GTGTTGTTAA GAAGTTAGAT 6923 



GAGAGGTGAG GAGACCAGTT AGAAAGCCAA TAAGCATTTC CAGATGAGAG ATAATGGTTC 6983 

TTGAAATCCA ATAGTGCCCA GGTCTAAATT GAGATGGGTG AATGAGGAAA ATAAGGAAGA 7043 

GAGAAGAGGC AAGATGGTGC CTAGGTTTGT GATGCCTCTT TCCTGGGTCT CTTGTCTCCA 7103 

CAG GA GGA GCC ATG GGG CAC TAC GTC TTA GCT GAA CGT GAG 7144 
Arg Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu 
340 345 

TGACACGCAG CCTGCAGACT CACTGTGGGA AGGAGACAAA ACTAGAGACT CAAAGAGGGA 7204 

GTGCATTTAT GAGCTCTTCA TGTTTCAGGA GAGAGTTGAA CCTAAACATA GAAATTGCCT 7264 

GACGAACTCC TTGATTTTAG CCTTCTCTGT TCATTTCCTC AAAAAGATTT CCCCATTTAG 7324 

GTTTCTGAGT TCCTGCATGC CGGTGATCCC TAGCTGTGAC CTCTCCCCTG GAACTGTCTC 7384 

TCATGAACCT CAAGCTGCAT CTAGAGGCTT CCTTCATTTC CTCCGTCACC TCAGAGACAT 7444 

ACACCTATGT CATTTCATTT CCTATTTTTG GAAGAGGACT CCTTAAATTT GGGGGACTTA 7504 

CATGATTCAT TTTAACATCT GAGAAAAGCT TTGAACCCTG GGACGTGGCT AGTCATAACC 7564 

TTACCAGATT TTTACACATG TATCTATGCA TTTTCTGGAC CCGTTCAACT TTTCCTTTGA 7624 

ATCCTCTCTC TGTGTTACCC AGTAACTCAT CTGTCACCAA GCCTTGGGGA TTCTTCCATC 7684 

TGATTGTGAT GTGAGTTGCA CAGCTATGAA GGCTGTACAC TGCACGAATG GAAGAGGCAC 7744 

CTGTCCCAGA AAAAGCATCA TGGCTATCTG TGGGTAGTAT GATGGGTGTT TTTAGCAGGT 7804 

AGGAGGCAAA TATCTTGAAA GGGGTTGTGA AGAGGTGTTT TTTCTAATTG GCATGAAGGT 7864 

GTCATACAGA TTTGCAAAGT TTAATGGTGC CTTCATTTGG GATGCTACTC TAGTATTCCA 7924 

GACCTGAAGA ATCACAATAA TTTTCTACCT GGTCTCTCCT TGTTCTGATA ATGAAAATTA 7984 

TGATAAGGAT GATAAAAGCA CTTACTTCGT GTCCGACTCT TCTGAGCACC TACTTACATG 8044 

CATTACTGCA TGCACTTCTT ACAATAATTC TATGAGATAG GTACTATTAT CCCCATTTCT 8104 

TTTTTAAATG AAGAAAGTGA AGTAGGCCGG GCACGGTGGC TCACGCCTGT AATCCCAGCA 8164 

CTTTGGGAGG CCAAAGCGGG TGGATCACGA GGTCAGGAGA TCGAGACCAT CCTGGCTAAC 8224 

ATGGTGAAAC CCCATCTCTA ATAAAAATAC AAAAAATTAG CTGGGCGTGG TGGCAGACGC 8284 

CTGTAGTCCC AGCTACTCGG AAGGCTGAGG CAGGAGAATG GCATGAACCC AGGAGGCAGA 8344 

GCTTGCAGTG AGCCGAGTTT GCGCCACTGC ACTCCAGCCT AGGTGACAGA GTGAGACTCC 8404 

ATCTCAAAAA AATAAAAATA AAAATAAAAA AATGAAAAAA AAAAGAAAGT GAAGTATAGA 8464 

GTATCTCATA GTTTGTCAGT GATAGAAACA GGTTTCAAAC TCAGTCAATC TGACCGTTTG 8524 

ATACATCTCA GACACCACTA CATTCAGTAG TTTAGATGCC TAGAATAAAT AGAGAAGGAA 8584 

GGAGATGGCT CTTCTCTTGT CTCATTGTGT TTCTTCTGAG TGAGCTTGAA TCACATGAAG 8644 

GGGAACAGCA GAAAACAACC AACTGATCCT CAGCTGTCAT GTTTCCTTTA AAAGTCCCTG 8704 

AAGGAAGGTC CTGGAATGTG ACTCCCTTGC TCCTCTGTTG CTCTCTTTGG CATTCATTTC 8764 

TTTGGACCCT ACGCAAGGAC TGTAATTGGT GGGGACAGCT AGTGGCCCTG CTGGGCTTCA 8824 

CACACGGTGT CCTCCCTAGG CCAGTGCCTC TGGAGTCAGA ACTCTGGTGG TATTTCCCTC 8884 


AATGAAGTGG AGTAAGCTCT CTCATTTTGA GATGGTATAA TGGAAGCCAC CAAGTGGCTT 8944 

AGAGGATGCC CAGGTCCTTC CATGGAGCCA CTGGGGTTCC GGTGCACATT AAAAAAAAAA 9004 

TCTAACCAGG ACATTCAGGA ATTGCTAGAT TCTGGGAAAT CAGTTCACCA TGTTCAAAAG 9064 

AGTCTTTTTT TTTTTTTTGA GACTCTATTG CCCAGGCTGG AGTGCAATGG CATGATCTCG 9124 

GCTCACTGTA ACCTCTGCCT CCCAGGTTCA AGCGATTCTC CTGTCTCAGC CTCCCAAGTA 9184 

GCTGGGATTA CAGGCGTGCA CCACCATGCC CGGCTAATTT TTGTATTTTT AGTAGAGACA 9244 


GGGTTTCACC ATGTTGGCCA GGCTGGTCTC GAACTCTCCT GACCTCGTGA TCCGCCTGCC 9304 

TCGGCCTCCC AAAGTGCTGA GATTACAGGT GTGAGCCACC CTGCCCAGCC GTCAAAAGAG 9364 

TCTTAATATA TATATCCAGA TGGCATGTGT TTACTTTATG TTACTACATG CACTTGGCTG 9424 

CATAAATGTG GTACAAGCAT TCTGTCTTGA AGGGCAGGTG CTTCAGGATA CCATATACAG 9484 


CTCAGAAGTT TCTTCTTTAG GCATTAAATT TTAGCAAAGA TATCTCATCT CTTCTTTTAA 9544 

ACCATTTTCT TTTTTTGTGG TTAGAAAAGT TATGTAGAAA AAAGTAAATG TGATTTACGC 9604 

TCATTGTAGA AAAGCTATAA AATGAATACA ATTAAAGCTG TTATTTAATT AGCCAGTGAA 9664 

AAACTATTAA CAACTTGTCT ATTACCTGTT AGTATTATTG TTGCATTAAA AATGCATATA 9724 

CTTTAATAAA TGTATATTGT ATTGTATACT GCATGATTTT ATTGAAGTTC TTGTTCATCT 9784 

TGTGTATATA CTTAATCGCT TTGTCATTTT GGAGACATTT ATTTTGCTTC TAATTTCTTT 9844 

ACATTTTGTC TTACGGAATA TTTTCATTCA ACTGTGGTAG CCGAATTAAT CGTGTTTCTT 9904 

CACTCTAGGG ACATTGTCGT CTAAGTTGTA AGACATTGGT TATTTTACCA GCAAACCATT 9964 

CTGAAAGCAT ATGACAAATT ATTTCTCTCT TAATATCTTA CTATACTGAA AGCAGACTGC 10024 

TATAAGGCTT CACTTACTCT TCTACCTCAT AAGGAATATG TTACAATTAA TTTATTAGGT 10084 

AAGCATTTGT TTTATATTGG TTTTATTTCA CCTGGGCTGA GATTTCAAGA AACACCCCAG 10144 

TCTTCACAGT AACACATTTC ACTAACACAT TTACTAAACA TCAGCAACTG TGGCCTGTTA 10204 

ATTTTTTTAA TAGAAATTTT AAGTCCTCAT TTTCTTTCGG TGTTTTTTAA GCTTAATTTT 10264 

TCTGGCTTTA TTCATAAATT CTTAAGGTCA ACTACATTTG AAAAATCAAA GACCTGCATT 10324 

TTAAATTCTT ATTCACCTCT GGCAAAACCA TTCACAAACC ATGGTAGTAA AGAGAAGGGT 10384 

GACACCTGGT GGCCATAGGT AAATGTACCA CGGTGGTCCG GTGACCAGAG ATGCAGCGCT 10444 

GAGGGTTTTC CTGAAGGTAA AGGAATAAAG AATGGGTGGA GGGGCGTGCA CTGGAAATCA 10504 

CTTGTAGAGA AAAGCCCCTG AAAATTTGAG AAAACAAACA AGAAACTACT TACCAGCTAT 10564 

TTGAATTGCT GGAATCACAG GCCATTGCTG AGCTGCCTGA ACTGGGAACA CAACAGAAGG 10624 

AAAACAAACC ACTCTGATAA TCATTGAGTC AAGTACAGCA GGTGATTGAG GACTGCTGAG 10684 

AGGTACAGGC CAAAATTCTT ATGTTGTATT ATAATAATGT CATCTTATAA TACTGTCAGT 10744 

ATTTTATAAA ACATTCTTCA CAAACTCACA CACATTTAAA AACAAAACAC TGTCTCTAAA 10804 

ATCCCCAAAT TTTTCATAAA C 10825 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 348 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln 
1 5 10 15 

Thr Ala Val Leu Gln Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr 
20 25 30 

Leu Phe Met Gly Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu 
35 40 45 

Ala Leu Gly Tyr Val Asp Asp Gln Leu Phe Val Phe Tyr Asp Asp Glu 
50 55 60 

Ser Arg Arg Val Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser 
65 70 75 80 

Ser Gln Met Trp Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His 
85 90 95 

Met Phe Thr Val Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser 
100 105 110 

Lys Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu 
115 120 125 

Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp 
130 135 140 

His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro 
145 150 155 160 

Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala 
165 170 175 

Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln 
180 185 190 

Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro 
195 200 205 

Leu Val Lys Val Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg 
210 215 220 

Cys Arg Ala Leu Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu 
225 230 235 240 

Lys Asp Lys Gln Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val 
245 250 255 

Leu Pro Asn Gly Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val 
260 265 270 

Pro Pro Gly Glu Glu Gln Arg Tyr Thr Cys Gln Val Glu His Pro Gly 
275 280 285 

Leu Asp Gln Pro Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr 
290 295 300 

Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu 
305 310 315 320 

Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg 
325 330 335 

Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu 
340 345 



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10825 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: join(361..436, 3762..4025, 4235..4510, 5606..5881,
6040..6153, 7107..7147)

(D) OTHER INFORMATION: /product= "Hereditary Hemochromatosis
(HH) protein containing both the 24d1
and 24d2 mutations"

/note= "Hereditary Hemochromatosis (HH)
gene containing a combination of both
24d1 and 24d2 alleles"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 140..7319

(D) OTHER INFORMATION: /note= "start and stop positions for
cDNA containing a combination of both
24d1 and 24d2 alleles
(SEQ ID NO: 12)"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 3852..3891

(D) OTHER INFORMATION: /note= "start and stop positions for
genomic sequence surrounding variant
for 24d2(G) allele (SEQ ID NO: 42)"

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 5507..6023

(D) OTHER INFORMATION: /note= "start and stop positions for
genomic sequence surrounding variant
for 24d1(A) allele (SEQ ID NO: 21)"

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(3872, "g")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis
(HH)"

/label= 24d2

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(5834, "a")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis
(HH)"

/label= 24d1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:



TCTAAGGTTG AGATAAAATT TTTAAATGTA TGATTGAATT TTGAAAATCA TAAATATTTA 60

AATATCTAAA GTTCAGATCA GAACATTGCG AAGCTACTTT CCCCAATCAA CAACACCCCT 120

TCAGGATTTA AAAACCAAGG GGGACACTGG ATCACCTAGT GTTTCACAAG CAGGTACCTT 180

CTGCTGTAGG AGAGAGAGAA CTAAAGTTCT GAAAGACCTG TTGCTTTTCA CCAGGAAGTT 240

TTACTGGGCA TCTCCTGAGC CTAGGCAATA GCTGTAGGGT GACTTCTGGA GCCATCCCCG 300

TTTCCCCGCC CCCCAAAAGA AGCGGAGATT TAACGGGGAC GTGCGGCCAG AGCTGGGGAA 360

ATG GGC CCG CGA GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG 408
Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln
1 5 10 15

ACC GCG GTC CTG CAG GGG CGC TTG CTG C GTGAGTCCGA GGGCTGCGGG 456
Thr Ala Val Leu Gln Gly Arg Leu Leu
20 25

CGAACTAGGG GCGCGGCGGG GGTGGAAAAA TCGAAACTAG CTTTTTCTTT GCGCTTGGGA 516

GTTTGCTAAC TTTGGAGGAC CTGCTCAACC CTATCCGCAA GCCCCTCTCC CTACTTTCTG 576

CGTCCAGACC CCGTGAGGGA GTGCCTACCA CTGAACTGCA GATAGGGGTC CCTCGCCCCA 636

GGACCTGCCC CCTCCCCCGG CTGTCCCGGC TCTGCGGAGT GACTTTTGGA ACCGCCCACT 696

CCCTTCCCCC AACTAGAATG CTTTTAAATA AATCTCGTAG TTCCTCACTT GAGCTGAGCT 756

AAGCCTGGGG CTCCTTGAAC CTGGAACTCG GGTTTATTTC CAATGTCAGC TGTGCAGTTT 816

TTTCCCCAGT CATCTCCAAA CAGGAAGTTC TTCCCTGAGT GCTTGCCGAG AAGGCTGAGC 876

AAACCCACAG CAGGATCCGC ACGGGGTTTC CACCTCAGAA CGAATGCGTT GGGCGGTGGG 936

GGCGCGAAAG AGTGGCGTTG GGGATCTGAA TTCTTCACCA TTCCACCCAC TTTTGGTGAG 996 

ACCTGGGGTG GAGGTCTCTA GGGTGGGAGG CTCCTGAGAG AGGCCTACCT CGGGCCTTTC 1056 

CCCACTCTTG GCAATTGTTC TTTTGCCTGG AAAATTAAGT ATATGTTAGT TTTGAACGTT 1116

TGAACTGAAC AATTCTCTTT TCGGCTAGGC TTTATTGATT TGCAATGTGC TGTGTAATTA 1176

AGAGGCCTCT CTACAAAGTA CTGATAATGA ACATGTAAGC AATGCACTCA CTTCTAAGTT 1236 

ACATTCATAT CTGATCTTAT TTGATTTTCA CTAGGCATAG GGAGGTAGGA GCTAATAATA 1296

CGTTTATTTT ACTAGAAGTT AACTGGAATT CAGATTATAT AACTCTTTTC AGGTTACAAA 1356 

GAACATAAAT AATCTGGTTT TCTGATGTTA TTTCAAGTAC TACAGCTGCT TCTAATCTTA 1416 

GTTGACAGTG ATTTTGCCCT GTAGTGTAGC ACAGTGTTCT GTGGGTCACA CGCCGGCCTC 1476

AGCACAGCAC TTTGAGTTTT GGTACTACGT GTATCCACAT TTTACACATG ACAAGAATGA 1536

GGCATGGCAC GGCCTGCTTC CTGGCAAATT TATTCAATGG TACACTGGGC TTTGGTGGCA 1596 

GAGCTCATGT CTCCACTTCA TAGCTATGAT TCTTAAACAT CACACTGCAT TAGAGGTTGA 1656 

ATAATAAAAT TTCATGTTGA GCAGAAATAT TCATTGTTTA CAAGTGTAAA TGAGTCCCAG 1716 

CCATGTGTTG CACTGTTCAA GCCCCAAGGG AGAGAGCAGG GAAACAAGTC TTTACCCTTT 1776

GATATTTTGC ATTCTAGTGG GAGAGATGAC AATAAGCAAA TGAGCAGAAA GATATACAAC 1836 

ATCAGGAAAT CATGGGTGTT GTGAGAAGCA GAGAAGTCAG GGCAAGTCAC TCTGGGGCTG 1896

ACACTTGAGC AGAGACATGA AGGAAATAAG AATGATATTG ACTGGGAGCA GTATTTCCCA 1956 

GGCAAACTGA GTGGGCCTGG CAAGTTGGAT TAAAAAGCGG GTTTTCTCAG CACTACTCAT 2016 

GTGTGTGTGT GTGGGGGGGG GGGGCGGCGT GGGGGTGGGA AGGGGGACTA CCATCTGCAT 2076

GTAGGATGTC TAGCAGTATC CTGTCCTCCC TACTCACTAG GTGCTAGGAG CACTCCCCCA 2136 

GTCTTGACAA CCAAAAATGT CTCTAAACTT TGCCACATGT CACCTAGTAG ACAAACTCCT 2196

GGTTAAGAAG CTCGGGTTGA AAAAAATAAA CAAGTAGTGC TGGGGAGTAG AGGCCAAGAA 2256

GTAGGTAATG GGCTCAGAAG AGGAGCCACA AACAAGGTTG TGCAGGCGCC TGTAGGCTGT 2316 

GGCTGCTATG TGGAAAGCAG AATGAAGGGA GCAACAGTAA AAGCAGGGAG CCCAGCCAGG 2436 

AAGCTGTTAC ACAGTCCAGG CAAGAGGTAG TGGAGTGGGC TGGGTGGGAA CAGAAAAGGG 2496 

AGTGACAAAC CATTGTCTCC TGAATATATT CTGAAGGAAG TTGCTGAAGG ATTCTATGTT 2556 

GTGTGAGAGA AAGAGAAGAA TTGGCTGGGT GTAGTAGCTC ATGCCAAGGA GGAGGCCAAG 2616 

GAGAGCAGAT TCCTGAGCTC AGGAGTTCAA GACCAGCCTG GGCAACACAG CAAAACCCCT 2676

TCTCTACAAA AAATACAAAA ATTAGCTGGG TGTGGTGGCA TGCACCTGTG ATCCTAGCTA 2736 



CTCGGGAGGC TGAGGTGGAG GGTATTGCTT GAGCCCAGGA AGTTGAGGCT GCAGTGAGCC 2796

ATGACTGTGC CACTGTACTT CAGCCTAGGT GACAGAGCAA GACCCTGTCT CCCCTGACCC 2856

CCTGAAAAAG AGAAGAGTTA AAGTTGACTT TGTTCTTTAT TTTAATTTTA TTGGCCTGAG 2916

CAGTGGGGTA ATTGGCAATG CCATTTCTGA GATGGTGAAG GCAGAGGAAA GAGCAGTTTG 2976

GGGTAAATCA AGGATCTGCA TTTGGGACAT GTTAAGTTTG AGATTCCAGT CAGGCTTCCA 3036

AGTGGTGAGG CCACATAGGC AGTTCAGTGT AAGAATTCAG GACCAAGGCT GGGCACGGTG 3096

GCTCACTTCT GTAATCCCAG CACTTTGGTG GCTGAGGCAG GTAGATGATT TGAGGTCAGG 3156

AGTTTGAGAC AAGCTTGGCC AACATGGTGA AACCCCATGT CTACTAAAAA TACAAAAATT 3216 

AGCCTGGTGT GGTGGCGCAC GCCTATAGTC CCAGGTTXTC AGGAGGCTTA GGTAGGAGAA 3276 

TCCCTTGAAC CCAGGAGGTG CAGGTTGCAG TGAGCTGAGA TTGTGCCACT GCACTCCAGC 3336 

CTGGGTGATA GAGTGAGACT CTGTCTCAAA AAAAAAAAAA AAAAAAAAAA AAAAAACTGA 3396 

AGGAATTATT CCTCAGGATT TGGGTCTAAT TTGCCCTGAG CACCAACTCC TGAGTTCAAC 3456

TACCATGGCT AGACACACCT TAACATTTTC TAGAATCCAC CAGCTTTAGT GGAGTCTGTC 3516 

TAATCATGAG TATTGGAATA GGATCTGGGG GCAGTGAGGG GGTGGCAGCC ACGTGTGGCA 3576 

GAGAAAAGCA CACAAGGAAA GAGCACCCAG GACTGTCATA TGGAAGAAAG ACAGGACTGC 3636

AACTCACCCT TCACAAAATG AGGACCAGAC ACAGCTGATG GTATGAGTTG ATGCAGGTGT 3696 

GTGGAGCCTC AACATCCTGC TCCCCTCCTA CTACACATGG TTAAGGCCTG TTGCTCTGTC 3756 

TCCAG GT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT GCC TCA GAG 3802
Arg Ser His Ser Leu His Tyr Leu Phe Met Gly Ala Ser Glu
30 35

CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC GTG GAT GAC 3850
Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr Val Asp Asp
40 45 50 55

CAG CTG TTC GTG TTC TAT GAT GAT GAG AGT CGC CGT GTG GAG CCC CGA 3898
Gln Leu Phe Val Phe Tyr Asp Asp Glu Ser Arg Arg Val Glu Pro Arg
60 65 70

ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG CTG CAG CTG 3946
Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp Leu Gln Leu
75 80 85

AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT GAC TTC TGG 3994
Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val Asp Phe Trp
90 95 100

ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG G GTATGTGGAG AGGGGGCCTC 4045
Thr Ile Met Glu Asn His Asn His Ser Lys
105 110

ACCTTCCTGA GGTTGTCAGA GCTTTTCATC TTTTCATGCA TCTTGAAGGA AACAGCTGGA 4105

AGTCTGAGGT CTTGTGGGAG CAGGGAAGAG GGAAGGAATT TGCTTCCTGA GATCATTTGG 4165

TCCTTGGGGA TGGTGGAAAT AGGGACCTAT TCCTTTGGTT GCAGTTAACA AGGCTGGGGA 4225

TTTTTCCAG AG TCC CAC ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG 4272
Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met
115 120 125

CAA GAA GAC AAC AGT ACC GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG 4320
Gln Glu Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly
130 135 140

CAG GAC CAC CTT GAA TTC TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA 4368
Gln Asp His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala
145 150 155

GAA CCC AGG GCC TGG CCC ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT 4416
Glu Pro Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile
160 165 170

CGG GCC AGG CAG AAC AGG GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG 4464
Arg Ala Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln
175 180 185 190

CTG CAG CAG TTG CTG GAG CTG GGG AGA GGT GTT TTG GAC CAA CAA G 4510
Leu Gln Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln
195 200 205




GTATGGTGGA AACACACTTC TGCCCCTATA CTCTAGTGGC AGAGTGGAGG AGGTTGCAGG 4570

GCACGGAATC CCTGGTTGGA GTTTCAGAGG TGGCTGAGGC TGTGTGCCTC TCCAAATTCT 4630

GGGAAGGGAC TTTCTCAATC CTAGAGTCTC TACCTTATAA TTGAGATGTA TGAGACAGCC 4690

ACAAGTCATG GGTTTAATTT CTTTTCTCCA TGCATATGGC TCAAAGGGAA GTGTCTATGG 4750

CCCTTGCTTT TTATTTAACC AATAATCTTT TGTATATTTA TACCTGTTAA AAATTCAGAA 4810

ATGTCAAGGC CGGGCACGGT GGCTCACCCC TGTAATCCCA GCACTTTGGG AGGCCGAGGC 4870

GGGTGGTCAC AAGGTCAGGA GTTTGAGACC AGCCTGACCA ACATGGTGAA ACCCGTCTCT 4930

AAAAAAATAC AAAAATTAGC TGGTCACAGT CATGCGCACC TGTAGTCCCA GCTAATTGGA 4990

AGGCTGAGGC AGGAGCATCG CTTGAACCTG GGAAGCGGAA GTTGCACTGA GCCAAGATCG 5050

CGCCACTGCA CTCCAGCCTA GGCAGCAGAG TGAGACTCCA TCTTAAAAAA AAAAAAAAAA 5110

AAAAAAAGAG AATTCAGAGA TCTCAGCTAT CATATGAATA CCAGGACAAA ATATCAAGTG 5170

AGGCCACTTA TCAGAGTAGA AGAATCCTTT AGGTTAAAAG TTTCTTTCAT AGAACATAGC 5230

AATAATCACT GAAGCTACCT ATCTTACAAG TCCGCTTCTT ATAACAATGC CTCCTAGGTT 5290

GACCCAGGTG AAACTGACCA TCTGTATTCA ATCATTTTCA ATGCACATAA AGGGCAATTT 5350

TATCTATCAG AACAAAGAAC ATGGGTAACA GATATGTATA TTTACATGTG AGGAGAACAA 5410

GCTGATCTGA CTGCTCTCCA AGTGACACTG TGTTAGAGTC CAATCTTAGG ACACAAAATG 5470

GTGTCTCTCC TGTAGCTTGT TTTTTTCTGA AAAGGGTATT TCCTTCCTCC AACCTATAGA 5530

AGGAAGTGAA AGTTCCAGTC TTCCTGGCAA GGGTAAACAG ATCCCCTCTC CTCATCCTTC 5590

CTCTTTCCTG TCAAG TG CCT CCT TTG GTG AAG GTG ACA CAT CAT GTG ACC 5640 
Val Pro Pro Leu Val Lys Val Thr His His Val Thr 
210 215 

TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG AAC TAC TAC CCC CAG 5688
Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu Asn Tyr Tyr Pro Gln
220 225 230

AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG CCA ATG GAT GCC AAG 5736
Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln Pro Met Asp Ala Lys
235 240 245

GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG GAT GGG ACC TAC CAG 5784
Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly Asp Gly Thr Tyr Gln
250 255 260 265

GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA GAG CAG AGA TAT ACG 5832
Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu Glu Gln Arg Tyr Thr
270 275 280

TAC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC CTC ATT GTG ATC TGG G 5881
Tyr Gln Val Glu His Pro Gly Leu Asp Gln Pro Leu Ile Val Ile Trp
285 290 295

GTATGTGACT GATGAGAGCC AGGAGCTGAG AAAATCTATT GGGGGTTGAG AGGAGTGCCT 5941

GAGGAGGTAA TTATGGCAGT GAGATGAGGA TCTGCTCTTT GTTAGGGGGT GGGCTGAGGG 6001

TGGCAATCAA AGGCTTTAAC TTGCTTTTTC TGTTTTAG AG CCC TCA CCG TCT 6053
Glu Pro Ser Pro Ser
300

GGC ACC CTA GTC ATT GGA GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC 6101
Gly Thr Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val
305 310 315

ATC TTG TTC ATT GGA ATT TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT 6149
Ile Leu Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly
320 325 330

TCA A GTGAGTAGGA ACAAGGGGGA AGTCTCTTAG TACCTCTGCC CCAGGGCACA 6203
Ser
335


GTGGGAAGAG GGGCAGAGGG GATCTGGCAT CCATGGGAAG CATTTTTCTC ATTTATATTC 6263

TTTGGGGACA CCAGCAGCTC CCTGGGAGAC AGAAAATAAT GGTTCTCCCC AGAATGAAAG 6323

TCTCTAATTC AACAAACATC TTCAGAGCAC CTACTATTTT GCAAGAGCTG TTTAAGGTAG 6383

TACAGGGGCT TTGAGGTTGA GAAGTCACTG TGGCTATTCT CAGAACCCAA ATCTGGTAGG 6443

GAATGAAATT GATAGCAAGT AAATGTAGTT AAAGAAGACC CCATGAGGTC CTAAAGCAGG 6503

CAGGAAGCAA ATGCTTAGGG TGTCAAAGGA AAGAATGATC ACATTCAGCT GGGGATCAAG 6563

ATAGCCTTCT GGATCTTGAA GGAGAAGCTG GATTCCATTA GGTGAGGTTG AAGATGATGG 6623

GAGGTCTACA CAGACGGAGC AACCATGCCA AGTAGGAGAG TATAAGGCAT ACTGGGAGAT 6683

TAGAAATAAT TACTGTACCT TAACCCTGAG TTTGCGTAGC TATCACTCAC CAATTATGCA 6743

TTTCTACCCC CTGAACATCT GTGGTGTAGG GAAAAGAGAA TCAGAAAGAA GCCAGCTCAT 6803 

ACAGAGTCCA AGGGTCTTTT GGGATATTGG GTTATGATCA CTGGGGTGTC ATTGAAGGAT 6863

CCTAAGAAAG GAGGACCACG ATCTCCCTTA TATGGTGAAT GTGTTGTTAA GAAGTTAGAT 6923 

GAGAGGTGAG GAGACCAGTT AGAAAGCCAA TAAGCATTTC CAGATGAGAG ATAATGGTTC 6983 

TTGAAATCCA ATAGTGCCCA GGTCTAAATT GAGATGGGTG AATGAGGAAA ATAAGGAAGA 7043 

GAGAAGAGGC AAGATGGTGC CTAGGTTTGT GATGCCTCTT TCCTGGGTCT CTTGTCTCCA 7103 

CAG GA GGA GCC ATG GGG CAC TAC GTC TTA GCT GAA CGT GAG 7144
Arg Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu 
340 345 

TGACACGCAG CCTGCAGACT CACTGTGGGA AGGAGACAAA ACTAGAGACT CAAAGAGGGA 7204

GTGCATTTAT GAGCTCTTCA TGTTTCAGGA GAGAGTTGAA CCTAAACATA GAAATTGCCT 7264

GACGAACTCC TTGATTTTAG CCTTCTCTGT TCATTTCCTC AAAAAGATTT CCCCATTTAG 7324

GTTTCTGAGT TCCTGCATGC CGGTGATCCC TAGCTGTGAC CTCTCCCCTG GAACTGTCTC 7384

TCATGAACCT CAAGCTGCAT CTAGAGGCTT CCTTCATTTC CTCCGTCACC TCAGAGACAT 7444

ACACCTATGT CATTTCATTT CCTATTTTTG GAAGAGGACT CCTTAAATTT GGGGGACTTA 7504

CATGATTCAT TTTAACATCT GAGAAAAGCT TTGAACCCTG GGACGTGGCT AGTCATAACC 7564

TTACCAGATT TTTACACATG TATCTATGCA TTTTCTGGAC CCGTTCAACT TTTCCTTTGA 7624

ATCCTCTCTC TGTGTTACCC AGTAACTCAT CTGTCACCAA GCCTTGGGGA TTCTTCCATC 7684

TGATTGTGAT GTGAGTTGCA CAGCTATGAA GGCTGTACAC TGCACGAATG GAAGAGGCAC 7744

CTGTCCCAGA AAAAGCATCA TGGCTATCTG TGGGTAGTAT GATGGGTGTT TTTAGCAGGT 7804

AGGAGGCAAA TATCTTGAAA GGGGTTGTGA AGAGGTGTTT TTTCTAATTG GCATGAAGGT 7864 

GTCATACAGA TTTGCAAAGT TTAATGGTGC CTTCATTTGG GATGCTACTC TAGTATTCCA 7924 

GACCTGAAGA ATCACAATAA TTTTCTACCT GGTCTCTCCT TGTTCTGATA ATGAAAATTA 7984

TGATAAGGAT GATAAAAGCA CTTACTTCGT GTCCGACTCT TCTGAGCACC TACTTACATG 8044

CATTACTGCA TGCACTTCTT ACAATAATTC TATGAGATAG GTACTATTAT CCCCATTTCT 8104 

TTTTTAAATG AAGAAAGTGA AGTAGGCCGG GCACGGTGGC TCACGCCTGT AATCCCAGCA 8164 

CTTTGGGAGG CCAAAGCGGG TGGATCACGA GGTCAGGAGA TCGAGACCAT CCTGGCTAAC 8224 

ATGGTGAAAC CCCATCTCTA ATAAAAATAC AAAAAATTAG CTGGGCGTGG TGGCAGACGC 8284

CTGTAGTCCC AGCTACTCGG AAGGCTGAGG CAGGAGAATG GCATGAACCC AGGAGGCAGA 8344

GCTTGCAGTG AGCCGAGTTT GCGCCACTGC ACTCCAGCCT AGGTGACAGA GTGAGACTCC 8404

ATCTCAAAAA AATAAAAATA AAAATAAAAA AATGAAAAAA AAAAGAAAGT GAAGTATAGA 8464

GTATCTCATA GTTTGTCAGT GATAGAAACA GGTTTCAAAC TCAGTCAATC TGACCGTTTG 8524

ATACATCTCA GACACCACTA CATTCAGTAG TTTAGATGCC TAGAATAAAT AGAGAAGGAA 8584 

GGAGATGGCT CTTCTCTTGT CTCATTGTGT TTCTTCTGAG TGAGCTTGAA TCACATGAAG 8644

GGGAACAGCA GAAAACAACC AACTGATCCT CAGCTGTCAT GTTTCCTTTA AAAGTCCCTG 8704

AAGGAAGGTC CTGGAATGTG ACTCCCTTGC TCCTCTGTTG CTCTCTTTGG CATTCATTTC 8764

TTTGGACCCT ACGCAAGGAC TGTAATTGGT GGGGACAGCT AGTGGCCCTG CTGGGCTTCA 8824 

CACACGGTGT CCTCCCTAGG CCAGTGCCTC TGGAGTCAGA ACTCTGGTGG TATTTCCCTC 8884 

AATGAAGTGG AGTAAGCTCT CTCATTTTGA GATGGTATAA TGGAAGCCAC CAAGTGGCTT 8944

AGAGGATGCC CAGGTCCTTC CATGGAGCCA CTGGGGTTCC GGTGCACATT AAAAAAAAAA 9004

TCTAACCAGG ACATTCAGGA ATTGCTAGAT TCTGGGAAAT CAGTTCACCA TGTTCAAAAG 9064 

AGTCTTTTTT TTTTTTTTGA GACTCTAXTG CCCAGGCTGG AGTGCAATGG CATGATCTCG 9124 

GCTCACTGTA ACCTCTGCCT CCCAGGTTCA AGCGATTCTC CTGTCTCAGC CTCCCAAGTA 9184

GCTGGGATTA CAGGCGTGCA CCACCATGCC CGGCTAATTT TTGTATTTTT AGTAGAGACA 9244 

GGGTTTCACC ATGTTGGCCA GGCTGGTCTC GAACTCTCCT GACCTCGTGA TCCGCCTGCC 9304 

TCGGCCTCCC AAAGTGCTGA GATTACAGGT GTGAGCCACC CTGCCCAGCC GTCAAAAGAG 9364

TCTTAATATA TATATCCAGA TGGCATGTGT TTACTTTATG TTACTACATG CACTTGGCTG 9424

CATAAATGTG GTACAAGCAT TCTGTCTTGA AGGGCAGGTG CTTCAGGATA CCATATACAG 9484 

CTCAGAAGTT TCTTCTTTAG GCATTAAATT TTAGCAAAGA TATCTCATCT CTTCTTTTAA 9544 

ACCATTTTCT TTTTTTGTGG TTAGAAAAGT TATGTAGAAA AAAGTAAATG TGATTTACGC 9604

TCATTGTAGA AAAGCTATAA AATGAATACA ATTAAAGCTG TTATTTAATT AGCCAGTGAA 9664 

AAACTATTAA CAACTTGTCT ATTACCTGTT AGTATTATTG TTGCATTAAA AATGCATATA 9724 

CTTTAATAAA TGTATAT2?GT ATTGTATACT GCATGATTTT ATTGAAGTTC TTGTTCATCT 9784

TGTGTATATA CTTAATCGCT TTGTCATTTT GGAGACATTT ATTTTGCTTC TAATTTCTTT 9844

ACATTTTGTC TTACGGAATA TTTTCATTCA ACTGTGGTAG CCGAATTAAT CGTGTTTCTT 9904

CACTCTAGGG ACATTGTCGT CTAAGTTGTA AGACATTGGT TATTTTACCA GCAAACCATT 9964

CTGAAAGCAT ATGACAAATT ATTTCTCTCT TAATATCTTA CTATACTGAA AGCAGACTGC 10024

TATAAGGCTT CACTTACTCT TCTACCTCAT AAGGAATATG TTACAATTAA TTTATTAGGT 10084 

AAGCATTTGT TTTATATTGG TTTTATTTCA CCTGGGCTGA GATTTCAAGA AACACCCCAG 10144

TCTTCACAG? AACACATTTC ACTAACACAT TTACTAAACA TCAGCAACTG TGGCCTGXTA 10204 

ATTTTTTTAA TAGAAATTTT AAGTCCTCAT TTTCTTTCGG TGTTTTTTAA GCTTAATTTT 10264

TCTGGCTTTA TTCATAAATT CTTAAGGTCA ACTACATTTG AAAAATCAAA GACCTGCATT 10324

TTAAATTCTT ATTCACCTCT GGCAAAACCA TTCACAAACC ATGGTAGTAA AGAGAAGGGT 10384

GACACCTGGT GGCCATAGGT AAATGTACCA CGGTGGTCCG GTGACCAGAG ATGCAGCGCT 10444

GAGGGTTTTC CTGAAGGTAA AGGAATAAAG AATGGGTGGA GGGGCGTGCA CTGGAAATCA 10504

CTTGTAGAGA AAAGCCCCTG AAAATTTGAG AAAACAAACA AGAAACTACT TACCAGCTAT 10564

TTGAATTGCT GGAATCACAG GCCATTGCTG AGCTGCCTGA ACTGGGAACA CAACAGAAGG 10624

AAAACAAACC ACTCTGATAA TCATTGAGTC AAGTACAGCA GGTGATTGAG GACTGCTGAG 10684

AGGTACAGGC CAAAATTCTT ATGTTGTATT ATAATAATGT CATCTTATAA TACTGTCAGT 10744

ATTTTATAAA ACATTCTTCA CAAACTCACA CACATTTAAA AACAAAACAC TGTCTCTAAA 10804

ATCCCCAAAT TTTTCATAAA C 10825



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 348 amino acids

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Met Gly Pro Arg Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln
1 5 10 15

Thr Ala Val Leu Gln Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr
20 25 30

Leu Phe Met Gly Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu
35 40 45

Ala Leu Gly Tyr Val Asp Asp Gln Leu Phe Val Phe Tyr Asp Asp Glu
50 55 60

Ser Arg Arg Val Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser
65 70 75 80

Ser Gln Met Trp Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His
85 90 95

Met Phe Thr Val Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser
100 105 110

Lys Glu Ser His Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu
115 120 125

Asp Asn Ser Thr Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp
130 135 140

His Leu Glu Phe Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro
145 150 155 160

Arg Ala Trp Pro Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala
165 170 175

Arg Gln Asn Arg Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln
180 185 190

Gln Leu Leu Glu Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro
195 200 205

Leu Val Lys Val Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg
210 215 220

Cys Arg Ala Leu Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu
225 230 235 240

Lys Asp Lys Gln Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val
245 250 255

Leu Pro Asn Gly Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val
260 265 270

Pro Pro Gly Glu Glu Gln Arg Tyr Thr Tyr Gln Val Glu His Pro Gly
275 280 285

Leu Asp Gln Pro Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr
290 295 300

Leu Val Ile Gly Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu
305 310 315 320

Phe Ile Gly Ile Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg
325 330 335

Gly Ala Met Gly His Tyr Val Leu Ala Glu Arg Glu
340 345



(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1440 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 222..1268

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(408, "c")

(D) OTHER INFORMATION: /phenotype= "normal or wild-type
(unaffected)"

/label= 24d2

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(414, "a")

(D) OTHER INFORMATION: /phenotype= "normal or wild-type
(unaffected)"

/label= 24d7

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(1066, "g")

(D) OTHER INFORMATION: /phenotype= "normal or wild-type
(unaffected)"

/label= 24d1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GGGGACACTG GATCACCTAG TGTTTCACAA GCAGGTACCT TCTGCTGTAG GAGAGAGAGA 60 

ACTAAAGTTC TGAAAGACCT GTTGCTTTTC ACCAGGAAGT TTTACTGGGC ATCTCCTGAG 120 

CCTAGGCAAT AGCTGTAGGG TGACTTCTGG AGCCATCCCC GTTTCCCCGC CCCCCAAAAG 180 

AAGCGGAGAT TTAACGGGGA CGTGCGGCCA GAGCTGGGGA A ATG GGC CCG CGA 233 

Met Gly Pro Arg 
1 

GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG ACC GCG GTC CTG 281
Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln Thr Ala Val Leu
5 10 15 20

CAG GGG CGC TTG CTG CGT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT 329
Gln Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr Leu Phe Met Gly
25 30 35

GCC TCA GAG CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC 377
Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr
40 45 50

GTG GAT GAC CAG CTG TTC GTG TTC TAT GAT CAT GAG AGT CGC CGT GTG 425
Val Asp Asp Gln Leu Phe Val Phe Tyr Asp His Glu Ser Arg Arg Val
55 60 65

GAG CCC CGA ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG 473
Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp
70 75 80

CTG CAG CTG AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT 521
Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val
85 90 95 100

GAC TTC TGG ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG GAG TCC CAC 569
Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser Lys Glu Ser His
105 110 115

ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG CAA GAA GAC AAC AGT ACC 617
Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu Asp Asn Ser Thr
120 125 130

GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG CAG GAC CAC CTT GAA TTC 665
Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp His Leu Glu Phe
135 140 145

TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA GAA CCC AGG GCC TGG CCC 713
Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro Arg Ala Trp Pro
150 155 160

ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT CGG GCC AGG CAG AAC AGG 761
Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala Arg Gln Asn Arg
165 170 175 180

GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG CTG CAG CAG TTG CTG GAG 809
Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln Gln Leu Leu Glu
185 190 195

CTG GGG AGA GGT GTT TTG GAC CAA CAA GTG CCT CCT TTG GTG AAG GTG 857
Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro Leu Val Lys Val
200 205 210

ACA CAT CAT GTG ACC TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG 905
Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu
215 220 225

AAC TAC TAC CCC CAG AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG 953
Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln
230 235 240

CCA ATG GAT GCC AAG GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG 1001
Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly
245 250 255 260

GAT GGG ACC TAC CAG GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA 1049
Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu
265 270 275

GAG CAG AGA TAT ACG TGC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC 1097
Glu Gln Arg Tyr Thr Cys Gln Val Glu His Pro Gly Leu Asp Gln Pro
280 285 290

CTC ATT GTG ATC TGG GAG CCC TCA CCG TCT GGC ACC CTA GTC ATT GGA 1145
Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr Leu Val Ile Gly
295 300 305

GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC ATC TTG TTC ATT GGA ATT 1193
Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu Phe Ile Gly Ile
310 315 320

TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT TCA AGA GGA GCC ATG GGG 1241
Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg Gly Ala Met Gly
325 330 335 340

CAC TAC GTC TTA GCT GAA CGT GAG TGACACGCAG CCTGCAGACT CACTGTGGGA 1295
His Tyr Val Leu Ala Glu Arg Glu
345

AGGAGACAAA ACTAGAGACT CAAAGAGGGA GTGCATTTAT GAGCTCTTCA TGTTTCAGGA 1355 

GAGAGTTGAA CCTAAACATA GAAATTGCCT GACGAACTCC TTGATTTTAG CCTTCTCTGT 1415 

TCATTTCCTC AAAAAGATTT CCCCA 1440 



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1440 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 222..1268

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(1066, "a")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis
(HH)"

/label= 24d1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

GGGGACACTG GATCACCTAG TGTTTCACAA GCAGGTACCT TCTGCTGTAG GAGAGAGAGA 60 

ACTAAAGTTC TGAAAGACCT GTTGCTTTTC ACCAGGAAGT TTTACTGGGC ATCTCCTGAG 120 

CCTAGGCAAT AGCTGTAGGG TGACTTCTGG AGCCATCCCC GTTTCCCCGC CCCCCAAAAG 180 

AAGCGGAGAT TTAACGGGGA CGTGCGGCCA GAGCTGGGGA A ATG GGC CCG CGA 233 

Met Gly Pro Arg 
1 

GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG ACC GCG GTC CTG 281
Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln Thr Ala Val Leu
5 10 15 20

CAG GGG CGC TTG CTG CGT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT 329
Gln Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr Leu Phe Met Gly
25 30 35

GCC TCA GAG CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC 377
Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr
40 45 50

GTG GAT GAC CAG CTG TTC GTG TTC TAT GAT CAT GAG AGT CGC CGT GTG 425
Val Asp Asp Gln Leu Phe Val Phe Tyr Asp His Glu Ser Arg Arg Val
55 60 65

GAG CCC CGA ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG 473
Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp
70 75 80

CTG CAG CTG AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT 521
Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val
85 90 95 100

GAC TTC TGG ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG GAG TCC CAC 569
Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser Lys Glu Ser His
105 110 115

ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG CAA GAA GAC AAC AGT ACC 617
Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu Asp Asn Ser Thr
120 125 130

GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG CAG GAC CAC CTT GAA TTC 665
Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp His Leu Glu Phe
135 140 145

TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA GAA CCC AGG GCC TGG CCC 713
Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro Arg Ala Trp Pro
150 155 160

ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT CGG GCC AGG CAG AAC AGG 761
Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala Arg Gln Asn Arg
165 170 175 180

GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG CTG CAG CAG TTG CTG GAG 809
Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln Gln Leu Leu Glu
185 190 195

CTG GGG AGA GGT GTT TTG GAC CAA CAA GTG CCT CCT TTG GTG AAG GTG 857
Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro Leu Val Lys Val
200 205 210

ACA CAT CAT GTG ACC TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG 905
Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu
215 220 225

AAC TAC TAC CCC CAG AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG 953
Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln
230 235 240

CCA ATG GAT GCC AAG GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG 1001
Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly
245 250 255 260

GAT GGG ACC TAC CAG GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA 1049
Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu
265 270 275

GAG CAG AGA TAT ACG TAC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC 1097
Glu Gln Arg Tyr Thr Tyr Gln Val Glu His Pro Gly Leu Asp Gln Pro
280 285 290

CTC ATT GTG ATC TGG GAG CCC TCA CCG TCT GGC ACC CTA GTC ATT GGA 1145
Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr Leu Val Ile Gly
295 300 305

GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC ATC TTG TTC ATT GGA ATT 1193
Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu Phe Ile Gly Ile
310 315 320

TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT TCA AGA GGA GCC ATG GGG 1241
Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg Gly Ala Met Gly
325 330 335 340

CAC TAC GTC TTA GCT GAA CGT GAG TGACACGCAG CCTGCAGACT CACTGTGGGA 1295
His Tyr Val Leu Ala Glu Arg Glu
345

AGGAGACAAA ACTAGAGACT CAAAGAGGGA GTGCATTTAT GAGCTCTTCA TGTTTCAGGA 1355

GAGAGTTGAA CCTAAACATA GAAATTGCCT GACGAACTCC TTGATTTTAG CCTTCTCTGT 1415

TCATTTCCTC AAAAAGATTT CCCCA 1440



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1440 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 222..1268

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(408, "g")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis
(HH)"

/label= 24d2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GGGGACACTG GATCACCTAG TGTTTCACAA GCAGGTACCT TCTGCTGTAG GAGAGAGAGA 60

ACTAAAGTTC TGAAAGACCT GTTGCTTTTC ACCAGGAAGT TTTACTGGGC ATCTCCTGAG 120 

CCTAGGCAAT AGCTGTAGGG TGACTTCTGG AGCCATCCCC GTTTCCCCGC CCCCCAAAAG 180 

AAGCGGAGAT TTAACGGGGA CGTGCGGCCA GAGCTGGGGA A ATG GGC CCG CGA 233 

Met Gly Pro Arg 
1 

GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG ACC GCG GTC CTG 281 
Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gin Thr Ala Val Leu 
5 10 15 20 

GAG GGG CGC TTG CTG CGT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT 329 
Gin Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr Leu Phe Met Gly 
25 30 35 

GCC TCA GAG CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC 377
Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr
40 45 50

GTG GAT GAC CAG CTG TTC GTG TTC TAT GAT GAT GAG AGT CGC CGT GTG 425
Val Asp Asp Gln Leu Phe Val Phe Tyr Asp Asp Glu Ser Arg Arg Val
55 60 65

GAG CCC CGA ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG 473
Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp
70 75 80

CTG CAG CTG AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT 521
Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val
85 90 95 100

GAC TTC TGG ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG GAG TCC CAC 569
Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser Lys Glu Ser His
105 110 115

ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG CAA GAA GAC AAC AGT ACC 617
Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu Asp Asn Ser Thr
120 125 130 

GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG CAG GAC CAC CTT GAA TTC 665
Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp His Leu Glu Phe
135 140 145 

TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA GAA CCC AGG GCC TGG CCC 713 
Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro Arg Ala Trp Pro 
150 155 160 

ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT CGG GCC AGG CAG AAC AGG 761 
Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala Arg Gln Asn Arg
165 170 175 180

GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG CTG CAG CAG TTG CTG GAG 809
Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln Gln Leu Leu Glu
185 190 195

CTG GGG AGA GGT GTT TTG GAC CAA CAA GTG CCT CCT TTG GTG AAG GTG 857
Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro Leu Val Lys Val
200 205 210

ACA CAT CAT GTG ACC TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG 905
Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu
215 220 225

AAC TAC TAC CCC CAG AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG 953
Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln
230 235 240

CCA ATG GAT GCC AAG GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG 1001
Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly
245 250 255 260

GAT GGG ACC TAC CAG GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA 1049
Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu
265 270 275

GAG CAG AGA TAT ACG TGC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC 1097
Glu Gln Arg Tyr Thr Cys Gln Val Glu His Pro Gly Leu Asp Gln Pro
280 285 290

CTC ATT GTG ATC TGG GAG CCC TCA CCG TCT GGC ACC CTA GTC ATT GGA 1145
Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr Leu Val Ile Gly
295 300 305

GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC ATC TTG TTC ATT GGA ATT 1193
Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu Phe Ile Gly Ile
310 315 320

TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT TCA AGA GGA GCC ATG GGG 1241
Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg Gly Ala Met Gly
325 330 335 340 

CAC TAC GTC TTA GCT GAA CGT GAG TGACACGCAG CCTGCAGACT CACTGTGGGA 1295
His Tyr Val Leu Ala Glu Arg Glu 
345 

AGGAGACAAA ACTAGAGACT CAAAGAGGGA GTGCATTTAT GAGCTCTTCA TGTTTCAGGA 1355 

GAGAGTTGAA CCTAAACATA GAAATTGCCT GACGAACTCC TTGATTTTAG CCTTCTCTGT 1415 

TCATTTCCTC AAAAAGATTT CCCCA 1440



(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1440 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 222..1268

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(408, "g")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis
(HH)"
/label= 24d2

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(1066, "a")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis
(HH)"
/label= 24d1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 



GGGGACACTG GATCACCTAG TGTTTCACAA GCAGGTACCT TCTGCTGTAG GAGAGAGAGA 60

ACTAAAGTTC TGAAAGACCT GTTGCTTTTC ACCAGGAAGT TTTACTGGGC ATCTCCTGAG 120

CCTAGGCAAT AGCTGTAGGG TGACTTCTGG AGCCATCCCC GTTTCCCCGC CCCCCAAAAG 180

AAGCGGAGAT TTAACGGGGA CGTGCGGCCA GAGCTGGGGA A ATG GGC CCG CGA 233
Met Gly Pro Arg
1

GCC AGG CCG GCG CTT CTC CTC CTG ATG CTT TTG CAG ACC GCG GTC CTG 281
Ala Arg Pro Ala Leu Leu Leu Leu Met Leu Leu Gln Thr Ala Val Leu
5 10 15 20

CAG GGG CGC TTG CTG CGT TCA CAC TCT CTG CAC TAC CTC TTC ATG GGT 329
Gln Gly Arg Leu Leu Arg Ser His Ser Leu His Tyr Leu Phe Met Gly
25 30 35

GCC TCA GAG CAG GAC CTT GGT CTT TCC TTG TTT GAA GCT TTG GGC TAC 377
Ala Ser Glu Gln Asp Leu Gly Leu Ser Leu Phe Glu Ala Leu Gly Tyr
40 45 50

GTG GAT GAC CAG CTG TTC GTG TTC TAT GAT GAT GAG AGT CGC CGT GTG 425
Val Asp Asp Gln Leu Phe Val Phe Tyr Asp Asp Glu Ser Arg Arg Val
55 60 65

GAG CCC CGA ACT CCA TGG GTT TCC AGT AGA ATT TCA AGC CAG ATG TGG 473
Glu Pro Arg Thr Pro Trp Val Ser Ser Arg Ile Ser Ser Gln Met Trp
70 75 80

CTG CAG CTG AGT CAG AGT CTG AAA GGG TGG GAT CAC ATG TTC ACT GTT 521
Leu Gln Leu Ser Gln Ser Leu Lys Gly Trp Asp His Met Phe Thr Val
85 90 95 100

GAC TTC TGG ACT ATT ATG GAA AAT CAC AAC CAC AGC AAG GAG TCC CAC 569
Asp Phe Trp Thr Ile Met Glu Asn His Asn His Ser Lys Glu Ser His
105 110 115

ACC CTG CAG GTC ATC CTG GGC TGT GAA ATG CAA GAA GAC AAC AGT ACC 617
Thr Leu Gln Val Ile Leu Gly Cys Glu Met Gln Glu Asp Asn Ser Thr
120 125 130

GAG GGC TAC TGG AAG TAC GGG TAT GAT GGG CAG GAC CAC CTT GAA TTC 665
Glu Gly Tyr Trp Lys Tyr Gly Tyr Asp Gly Gln Asp His Leu Glu Phe
135 140 145

TGC CCT GAC ACA CTG GAT TGG AGA GCA GCA GAA CCC AGG GCC TGG CCC 713
Cys Pro Asp Thr Leu Asp Trp Arg Ala Ala Glu Pro Arg Ala Trp Pro
150 155 160

ACC AAG CTG GAG TGG GAA AGG CAC AAG ATT CGG GCC AGG CAG AAC AGG 761
Thr Lys Leu Glu Trp Glu Arg His Lys Ile Arg Ala Arg Gln Asn Arg
165 170 175 180

GCC TAC CTG GAG AGG GAC TGC CCT GCA CAG CTG CAG CAG TTG CTG GAG 809
Ala Tyr Leu Glu Arg Asp Cys Pro Ala Gln Leu Gln Gln Leu Leu Glu
185 190 195

CTG GGG AGA GGT GTT TTG GAC CAA CAA GTG CCT CCT TTG GTG AAG GTG 857
Leu Gly Arg Gly Val Leu Asp Gln Gln Val Pro Pro Leu Val Lys Val
200 205 210

ACA CAT CAT GTG ACC TCT TCA GTG ACC ACT CTA CGG TGT CGG GCC TTG 905
Thr His His Val Thr Ser Ser Val Thr Thr Leu Arg Cys Arg Ala Leu
215 220 225

AAC TAC TAC CCC CAG AAC ATC ACC ATG AAG TGG CTG AAG GAT AAG CAG 953
Asn Tyr Tyr Pro Gln Asn Ile Thr Met Lys Trp Leu Lys Asp Lys Gln
230 235 240

CCA ATG GAT GCC AAG GAG TTC GAA CCT AAA GAC GTA TTG CCC AAT GGG 1001
Pro Met Asp Ala Lys Glu Phe Glu Pro Lys Asp Val Leu Pro Asn Gly
245 250 255 260

GAT GGG ACC TAC CAG GGC TGG ATA ACC TTG GCT GTA CCC CCT GGG GAA 1049
Asp Gly Thr Tyr Gln Gly Trp Ile Thr Leu Ala Val Pro Pro Gly Glu
265 270 275

GAG CAG AGA TAT ACG TAC CAG GTG GAG CAC CCA GGC CTG GAT CAG CCC 1097
Glu Gln Arg Tyr Thr Tyr Gln Val Glu His Pro Gly Leu Asp Gln Pro
280 285 290

CTC ATT GTG ATC TGG GAG CCC TCA CCG TCT GGC ACC CTA GTC ATT GGA 1145
Leu Ile Val Ile Trp Glu Pro Ser Pro Ser Gly Thr Leu Val Ile Gly
295 300 305

GTC ATC AGT GGA ATT GCT GTT TTT GTC GTC ATC TTG TTC ATT GGA ATT 1193
Val Ile Ser Gly Ile Ala Val Phe Val Val Ile Leu Phe Ile Gly Ile
310 315 320

TTG TTC ATA ATA TTA AGG AAG AGG CAG GGT TCA AGA GGA GCC ATG GGG 1241
Leu Phe Ile Ile Leu Arg Lys Arg Gln Gly Ser Arg Gly Ala Met Gly
325 330 335 340

CAC TAC GTC TTA GCT GAA CGT GAG TGACACGCAG CCTGCAGACT CACTGTGGGA 1295
His Tyr Val Leu Ala Glu Arg Glu
345

AGGAGACAAA ACTAGAGACT CAAAGAGGGA GTGCATTTAT GAGCTCTTCA TGTTTCAGGA 1355 

GAGAGTTGAA CCTAAACATA GAAATTGCCT GACGAACTCC TTGATTTTAG CCTTCTCTGT 1415 

TCATTTCCTC AAAAAGATTT CCCCA 1440

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
TGGCAAGGGT AAACAGATCC 20 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
CTCAGGCACT CCTCTCAACC 20 

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
NGAAGAGCAG AGATATACGT G 21 

(2) INFORMATION FOR SEQ ID NO:16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
NGAAGAGCAG AGATATACGT A 21 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated cytosine
(P-C)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 18

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
guanine (G-dig)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
NCAGGTGGAG CACCCAGN 18 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
CTGAAAGGGT GGGATCACAT 20

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
CAAGGAGTTC GTCAGGCAAT 20 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 517 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..517

(D) OTHER INFORMATION: /note= "normal or wild-type (unaffected)
genomic sequence surrounding variant for
24d1(G) allele corresponding to positions
5507-6023 of genomic sequence containing
the HH gene (SEQ ID NO: 1)"

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(328, "g")

(D) OTHER INFORMATION: /phenotype= "normal or wild-type
(unaffected)"
/label= 24d1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:



TATTTCCTTC CTCCAACCTA TAGAAGGAAG TGAAAGTTCC AGTCTTCCTG GCAAGGGTAA 60

ACAGATCCCC TCTCCTCATC CTTCCTCTTT CCTGTCAAGT GCCTCCTTTG GTGAAGGTGA 120

CACATCATGT GACCTCTTCA GTGACCACTC TACGGTGTCG GGCCTTGAAC TACTACCCCC 180

AGAACATCAC CATGAAGTGG CTGAAGGATA AGCAGCCAAT GGATGCCAAG GAGTTCGAAC 240

CTAAAGACGT ATTGCCCAAT GGGGATGGGA CCTACCAGGG CTGGATAACC TTGGCTGTAC 300

CCCCTGGGGA AGAGCAGAGA TATACGTGCC AGGTGGAGCA CCCAGGCCTG GATCAGCCCC 360

TCATTGTGAT CTGGGGTATG TGACTGATGA GAGCCAGGAG CTGAGAAAAT CTATTGGGGG 420

TTGAGAGGAG TGCCTGAGGA GGTAATTATG GCAGTGAGAT GAGGATCTGC TCTTTGTTAG 480

GGGGTGGGCT GAGGGTGGCA ATCAAAGGCT TTAACTT 517



(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 517 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: -

(B) LOCATION: 1..517

(D) OTHER INFORMATION: /note= "genomic sequence surrounding
variant for 24d1(A) allele corresponding
to positions 5507-6023 of genomic
sequence containing the HH gene
(SEQ ID NO: 3)"

(ix) FEATURE:

(A) NAME/KEY: allele

(B) LOCATION: replace(328, "a")

(D) OTHER INFORMATION: /phenotype= "Hereditary Hemochromatosis
(HH)"
/label= 24d1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



TATTTCCTTC CTCCAACCTA TAGAAGGAAG TGAAAGTTCC AGTCTTCCTG GCAAGGGTAA 60

ACAGATCCCC TCTCCTCATC CTTCCTCTTT CCTGTCAAGT GCCTCCTTTG GTGAAGGTGA 120

CACATCATGT GACCTCTTCA GTGACCACTC TACGGTGTCG GGCCTTGAAC TACTACCCCC 180

AGAACATCAC CATGAAGTGG CTGAAGGATA AGCAGCCAAT GGATGCCAAG GAGTTCGAAC 240

CTAAAGACGT ATTGCCCAAT GGGGATGGGA CCTACCAGGG CTGGATAACC TTGGCTGTAC 300

CCCCTGGGGA AGAGCAGAGA TATACGTACC AGGTGGAGCA CCCAGGCCTG GATCAGCCCC 360

TCATTGTGAT CTGGGGTATG TGACTGATGA GAGCCAGGAG CTGAGAAAAT CTATTGGGGG 420

TTGAGAGGAG TGCCTGAGGA GGTAATTATG GCAGTGAGAT GAGGATCTGC TCTTTGTTAG 480

GGGGTGGGCT GAGGGTGGCA ATCAAAGGCT TTAACTT 517



(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 361 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..361

(D) OTHER INFORMATION: /note= "Rabbit leukocyte antigen (RLA)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Met Gly Ser Ile Pro Pro Arg Thr Leu Leu Leu Leu Leu Ala Gly Ala
1 5 10 15

Leu Thr Leu Lys Asp Thr Gln Ala Gly Ser His Ser Met Arg Tyr Phe
20 25 30

Tyr Thr Ser Val Ser Arg Pro Gly Leu Gly Glu Pro Arg Phe Ile Ile
35 40 45

Val Gly Tyr Val Asp Asp Thr Gln Phe Val Arg Phe Asp Ser Asp Ala
50 55 60

Ala Ser Pro Arg Met Glu Gln Arg Ala Pro Trp Met Gly Gln Val Glu
65 70 75 80

Pro Glu Tyr Trp Asp Gln Gln Thr Gln Ile Ala Lys Asp Thr Ala Gln
85 90 95

Thr Phe Arg Val Asn Leu Asn Thr Ala Leu Arg Tyr Tyr Asn Gln Ser
100 105 110

Ala Ala Gly Ser His Thr Phe Gln Thr Met Phe Gly Cys Glu Val Trp
115 120 125

Ala Asp Gly Arg Phe Phe His Gly Tyr Arg Gln Tyr Ala Tyr Asp Gly
130 135 140

Ala Asp Tyr Ile Ala Leu Asn Glu Asp Leu Arg Ser Trp Thr Ala Ala
145 150 155 160

Asp Thr Ala Ala Gln Asn Thr Gln Arg Lys Trp Glu Ala Ala Gly Glu
165 170 175

Ala Glu Arg His Arg Ala Tyr Leu Glu Arg Glu Cys Val Glu Trp Leu
180 185 190

Arg Arg Tyr Leu Glu Met Gly Lys Glu Thr Leu Gln Arg Ala Asp Pro
195 200 205

Pro Lys Ala His Val Thr His His Pro Ala Ser Asp Arg Glu Ala Thr
210 215 220

Leu Arg Cys Trp Ala Leu Gly Phe Tyr Pro Ala Glu Ile Ser Leu Thr
225 230 235 240

Trp Gln Arg Asp Gly Glu Asp Gln Thr Gln Asp Thr Glu Leu Val Glu
245 250 255

Thr Arg Pro Gly Gly Asp Gly Thr Phe Gln Lys Trp Ala Ala Val Val
260 265 270

Val Pro Ser Gly Glu Glu Gln Arg Tyr Thr Cys Arg Val Gln His Glu
275 280 285

Gly Leu Pro Glu Pro Leu Thr Leu Thr Trp Glu Pro Pro Ala Gln Pro
290 295 300

Thr Ala Leu Ile Val Gly Ile Val Ala Gly Val Leu Gly Val Leu Leu
305 310 315 320

Ile Leu Gly Ala Val Val Ala Val Val Arg Arg Lys Lys His Ser Ser
325 330 335

Asp Gly Lys Gly Gly Arg Tyr Thr Pro Ala Ala Gly Gly His Arg Asp
340 345 350

Gln Gly Ser Asp Asp Ser Leu Met Pro
355 360



(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 365 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:

(A) NAME/KEY: Protein

(B) LOCATION: 1..365

(D) OTHER INFORMATION: /note= "Human Major Histocompatibility
Class I (MHC) protein"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Met Ala Val Met Ala Pro Arg Thr Leu Val Leu Leu Leu Ser Gly Ala
1 5 10 15

Leu Ala Leu Thr Gln Thr Trp Ala Gly Ser His Ser Met Arg Tyr Phe
20 25 30

Phe Thr Ser Val Ser Arg Pro Gly Arg Gly Glu Pro Arg Phe Ile Ala
35 40 45

Val Gly Tyr Val Asp Asp Thr Gln Phe Val Arg Phe Asp Ser Asp Ala
50 55 60

Ala Ser Gln Arg Met Glu Pro Arg Ala Pro Trp Ile Glu Gln Glu Gly
65 70 75 80

Pro Glu Tyr Trp Asp Gly Glu Thr Arg Lys Val Lys Ala His Ser Gln
85 90 95

Thr His Arg Val Asp Leu Gly Thr Leu Arg Gly Tyr Tyr Asn Gln Ser
100 105 110

Glu Ala Gly Ser His Thr Leu Gln Met Met Phe Gly Cys Asp Val Gly
115 120 125

Ser Asp Trp Arg Phe Leu Arg Gly Tyr His Gln Tyr Ala Tyr Asp Gly
130 135 140

Lys Asp Tyr Ile Ala Leu Lys Glu Asp Leu Arg Ser Trp Thr Ala Ala
145 150 155 160

Asp Met Ala Ala Gln Thr Thr Lys His Lys Trp Glu Ala Ala His Val
165 170 175

Ala Glu Gln Leu Arg Ala Tyr Leu Glu Gly Thr Cys Val Glu Trp Leu
180 185 190

Arg Arg Tyr Leu Glu Asn Gly Lys Glu Thr Leu Gln Arg Thr Asp Ala
195 200 205

Pro Lys Thr His Met Thr His His Ala Val Ser Asp His Glu Ala Thr
210 215 220

Leu Arg Cys Trp Ala Leu Ser Phe Tyr Pro Ala Glu Ile Thr Leu Thr
225 230 235 240

Trp Gln Arg Asp Gly Glu Asp Gln Thr Gln Asp Thr Glu Leu Val Glu
245 250 255

Thr Arg Pro Ala Gly Asp Gly Thr Phe Gln Lys Trp Ala Ala Val Val
260 265 270

Val Pro Ser Gly Gln Glu Gln Arg Tyr Thr Cys His Val Gln His Glu
275 280 285

Gly Leu Pro Lys Pro Leu Thr Leu Arg Trp Glu Pro Ser Ser Gln Pro
290 295 300

Thr Ile Pro Ile Val Gly Ile Ile Ala Gly Leu Val Leu Phe Gly Ala
305 310 315 320

Val Ile Thr Gly Ala Val Val Ala Ala Val Met Trp Arg Arg Lys Ser
325 330 335

Ser Asp Arg Lys Gly Gly Ser Tyr Ser Gln Ala Ala Ser Ser Asp Ser
340 345 350

Ala Gln Gly Ser Asp Val Ser Leu Thr Ala Cys Lys Val
355 360 365

(2) INFORMATION FOR SEQ ID NO: 24:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 
ACATGGTTAA GGCCTGTTGC 20 

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 
GCCACATCTG GCTTGAAATT 20 



(2) INFORMATION FOR SEQ ID NO: 26:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated adenine
(bio-A)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
NGCTGTTCGT GTTCTATGAT C 21 



(2) INFORMATION FOR SEQ ID NO: 27: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated adenine
(bio-A)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 
NGCTGTTCGT GTTCTATGAT G 21 



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated adenine
(p-A)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 19

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
adenine (A-dig)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
NTGAGAGTCG CCGTGTGGN 19 

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
GGAAGAGCAG AGATATACGT GCCAGGTGGA GCACCCAGG 39 

(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GGAAGAGCAG AGATATACGT ACCAGGTGGA GCACCCAGG 39 

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
CAAAAGAAGC GGAGATTTAA CG 22

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
AGATTTAACG GGGACGTGC 19 

(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33: 
AGAGGTCACA TGATGTGTCA CC 22 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: 
AGGAGGCACT TGTTGGTCC 19 



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
AAAATCACAA CCACAGCAAA G 21 

(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
TTCCCACAGT GAGTCTGCAG 20 

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 19 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 
CAATGGGGAT GGGACCTAC 19 

(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
ATATACGTGC CAGGTGGAGC 20 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
CCTCTTCACA ACCCCTTTCA 20 

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
CATAGCTGTG CAACTCACAT CA 22 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
AGCTGTTCGT GTTCTATGAT CATGAGAGTC GCCGTGTGGA 40

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
AGCTGTTCGT GTTCTATGAT GATGAGAGTC GCCGTGTGGA 40

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
TGTTCTATGA TCATGAGAGT CGCCGTGTGG AG 32 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
TGTTCTATGA TGATGAGTGT CGCCGTGTGG AG 32 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 
ATATACGTGC CAGGTGG 17 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
ATATACGTAC CAGGTGG 17 
(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
TCTATGATCA TGAGAGT 17 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48: 
TCTATGATGA TGAGAGT 17 
(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 

TGGGTGCTCC ACCTGGC 17

(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 17 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
TGGGTGCTCC ACCTGGT 17
(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
CACACGGCGA CTCTCATG 18 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

CACACGGCGA CTCTCATC 18 

(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = fluorescein-labeled guanine" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 
NGAAGAGCAG AGATATACGT 20 
(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = fluorescein-labeled guanine"






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 
NGCCTGGGTG CTCCACCTGG 20 
(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = fluorescein-labeled adenine"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
NGCTGTTCGT GTTCTATGAT 20 
(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = fluorescein-labeled cytosine"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:

NTCCACACGG CGACTCTCAT 20 

(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

GCTCCACCTG GCACG 15 

(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 15 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
GCTCCACCTG GTACG 15

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 15 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
GCGACTCTCA TCATC 15 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
GCGACTCTCA TGATC 15 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61:
NCCTGGGTGC TCCACCTGGC 20 
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated adenine
(p-A)"

(ix) FEATURE:

(A) NAME/KEY: modified_base
(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
cytosine (C-dig)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
NCGTATATCT CTGCTCTTCN 20 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:
NAAGAGCAGA GATATACGTG 20
(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated cytosine
(P-C)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
cytosine (C-dig)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
NCAGGTGGAG CACCCAGGCN 20
(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:
NCCTGGGTGC TCCACCTGGT 20 
(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated adenine
(p-A)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
cytosine (C-dig)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
NCGTATATCT CTGCTCTTCN 20 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
NAAGAGCAGA GATATACGTA 20 
(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated cytosine
(p-C)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
cytosine (C-dig)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:
NCAGGTGGAG CACCCAGGCN 20 
(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated thymine
(bio-T)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
NCCACACGGC GACTCTCATG 20 
(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated adenine
(p-A)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
thymine (T-dig)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
NTCATAGAAC ACGAACAGCN 20 
(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
NCTGTTCGTG TTCTATGATC 20 
(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated adenine
(p-A)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
guanine (G-dig)"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:
NTGAGAGTCG CCGTGTGGAN 20 






(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated thymine
(bio-T)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:
NCCACACGGC GACTCTCATC 20
(2) INFORMATION FOR SEQ ID NO: 74:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated adenine
(p-A)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
thymine (T-dig)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74:
NTCATAGAAC ACGAACAGCN 20
(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-biotinylated guanine
(bio-G)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:
NCTGTTCGTG TTCTATGATG 20 
(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: modified_base

(B) LOCATION: 1

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 5'-phosphorylated adenine
(p-A)"

(ix) FEATURE:

(A) NAME/KEY: modified_base

(B) LOCATION: 20

(D) OTHER INFORMATION: /mod_base= OTHER

/note= "N = 3'-digoxigenin-conjugated
guanine (G-dig)"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
NTGAGAGTCG CCGTGTGGAN 20 